## Exercise 01

Read the dataset from the previous class into a variable named `df`, using the function [`read_csv()`](https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html), and show its contents.

```python
import pandas as pd
import numpy as np

# Load the CSV data (from 'data/track.csv') and see what we get.
```

As you can see, the index is the default one: a sequential number from $0$ to $n-1$ (for a CSV with $n$ rows of data). Note that `read_csv()` detected the column names from the first row of the file. This is not always the case...

## Exercise 02

Inspect the returned dataframe, using the `info()` method.

The columns for latitude, longitude and elevation were interpreted as `float64` (double-precision floating-point numbers). The `time` and `place` columns have dtype `object`.

## Exercise 03

What are the types of `df['lat']` and `df.time`?

## Exercise 04

What are the values and respective types in the first position of `df.lat` and `df['time']`?

As you can verify, the `time` column produced a series whose elements are of type `str`, that is, strings. This is not good if we want to do operations with dates and times...

## Exercise 05

Looking at the documentation of the function [`read_csv()`](https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html), see if you can have the `time` column interpreted as a date with time. Use the `info()` method again to inspect the type of the elements of each column of the data frame.

## Exercise 06

Use the [`head()`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.head.html) method to show only the first 10 rows of the data frame.

## Exercise 07

Likewise, show the last 5 rows of the data frame.

**Tip**: look in the documentation for `tail()` rather than `head()`.

## Exercise 08

What is the type of values returned by the `head()` and `tail()` methods?

## Exercise 09

What is the time elapsed (in seconds) between the samples at indexes 0 and 100 of the route?

**Tip**: Note that the type of the elements in the `time` column is [`pandas.Timestamp`](https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html). It can also be useful to look at the class [`pandas.Timedelta`](https://pandas.pydata.org/docs/reference/api/pandas.Timedelta.html#pandas-timedelta).

First compute the difference between the two timestamps, then use the `total_seconds()` method to get the delta in seconds.
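The two steps in the tip can be sketched on a pair of made-up timestamps (these values are illustrative and are not taken from the track data):

```python
import pandas as pd

# Two made-up timestamps (not taken from the track data).
t_start = pd.Timestamp("2012-05-26 10:00:00")
t_end = pd.Timestamp("2012-05-26 10:01:30")

# Subtracting two Timestamps yields a Timedelta.
delta = t_end - t_start

# total_seconds() converts the Timedelta to a float number of seconds.
elapsed_seconds = delta.total_seconds()
print(type(delta).__name__, elapsed_seconds)
```

The same pattern should apply to two elements picked from the `time` column of `df`.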

## Exercise 10

What is the total duration of the route (in seconds)?

**Tip:** compute the last index, using the `len` function.

## Exercise 11

What is the average GPS sampling rate (in samples per second)?

**Tip**: You need to get the total number of rows of a [`DataFrame`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html).

## Exercise 12

So far, our data frame has a numerical index. The data in this set is a time series, so each sample could be indexed by the date/time at which it was obtained. Use the `read_csv()` function again so that the index is the `time` column. Show the resulting data frame.

**Tip:** check the argument `parse_dates`.
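As a minimal sketch of the idea, the snippet below reads a hypothetical three-row CSV from an in-memory string instead of the real file. The column names `lon` and `ele` and all values are assumptions made for illustration; `parse_dates` and `index_col` are existing `read_csv()` parameters.

```python
import io
import pandas as pd

# Hypothetical three-row CSV standing in for the real file (made-up values).
csv_text = """time,lat,lon,ele
2012-05-26 10:00:00,38.70,-9.14,12.0
2012-05-26 10:00:05,38.71,-9.15,13.5
2012-05-26 10:00:10,38.72,-9.16,15.0
"""

# parse_dates converts the listed columns to datetimes;
# index_col makes the chosen column the index of the data frame.
toy = pd.read_csv(io.StringIO(csv_text), parse_dates=["time"], index_col="time")
toy.info()
```

With this toy input, `info()` should report a `DatetimeIndex` rather than the default numerical index.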

## Exercise 13

Get the samples obtained between 10:00 and 10:05. The day of the track is 2012-05-26.

**Note:** Do not forget that the `time` values are datetimes.
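One possible approach, assuming the data frame is indexed by time as in the previous exercise, is to slice with `.loc` using date strings. The sketch below uses a toy frame with made-up values sampled once per minute:

```python
import pandas as pd

# Toy frame: one sample per minute from 09:58 to 10:07 (made-up values).
index = pd.date_range("2012-05-26 09:58", periods=10, freq="min")
toy = pd.DataFrame({"ele": range(10)}, index=index)

# With a sorted DatetimeIndex, .loc accepts date strings as slice bounds;
# both ends of the slice are included.
window = toy.loc["2012-05-26 10:00":"2012-05-26 10:05"]
print(window)
```

When the index has a finer resolution than the strings, a bound such as `10:05` may also cover samples later within that minute, so it is worth checking the last rows returned on the real data.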

## Exercise 14

Add a column (`delta_ele`) to the data frame with the elevation difference relative to the previous point. Note that this series does not have a defined value for the first sample.

**Tip**: Check the documentation of the [`diff`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.diff.html) method. Notice that `diff` can be used to compute the difference between each element and the previous one. You can also check a related method, [`shift`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.shift.html).
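To see how `diff` and `shift` relate, here is a small sketch on a made-up series of elevations (not the track data). Both expressions give the difference to the previous element and leave the first position undefined (`NaN`):

```python
import pandas as pd

ele = pd.Series([100.0, 102.5, 101.0, 104.0])  # made-up elevations

# diff() subtracts the previous element from each element.
delta_a = ele.diff()

# The same result, written explicitly with shift(): shift(1) moves
# every value one position down, so row i is paired with row i-1.
delta_b = ele - ele.shift(1)

print(pd.DataFrame({"ele": ele, "diff": delta_a, "via_shift": delta_b}))
```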

## Exercise 15

The function `haversine(lon1, lat1, lon2, lat2)` accepts 4 series with the coordinates of pairs of points and produces a series with the distance between the two points of each pair.
Call the function with 2 series from the original data frame (`df`) and the same 2 series shifted.

Notice that `haversine` uses mathematical functions from `numpy` that operate element-wise on NumPy arrays, and therefore also on series.

**Tip**: Check the [`shift`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.shift.html) method.

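```python
import numpy as np

def haversine(lon1, lat1, lon2, lat2):
    """Great-circle distance (in km) between points given in decimal degrees."""
    # Convert all coordinates from degrees to radians.
    lon1, lat1, lon2, lat2 = map(np.radians, [lon1, lat1, lon2, lat2])

    dlat = lat2 - lat1
    dlon = lon2 - lon1

    # Haversine formula.
    a = np.sin(dlat/2)**2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon/2)**2
    r = 6371  # mean Earth radius, in km
    return 2*np.arcsin(np.sqrt(a))*r

#print(haversine(.....))
```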

## Exercise 16

Use the result of the previous call to add a new `dist` column to the dataframe `df`. 

## Exercise 17

Plot the elevation profile (altimetry) of the route as a function of time. Check the documentation of the data frame's [`plot`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.plot.html) method.

## Exercise 18

Plot the elevation profile (altimetry) of the route as a function of the accumulated distance (rather than as a function of time, as in the previous exercise).

**Tip:** Add a new column with the accumulated distance traveled. Consult the documentation of the [`cumsum`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.cumsum.html) method.
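A short sketch of how `cumsum` behaves, on a toy `dist` column with made-up values and an undefined first entry. The column name `dist_acc` is a hypothetical choice for illustration:

```python
import pandas as pd

# Made-up step distances; the first one is undefined, as with diff/shift.
toy = pd.DataFrame({"dist": [float("nan"), 0.5, 0.25, 0.25]})

# cumsum() accumulates the values row by row; by default NaN entries
# are skipped in the running total and stay NaN in the result.
toy["dist_acc"] = toy["dist"].cumsum()
print(toy)
```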

## Exercise 19

Add a new column (`delta_t`) to the data frame with the time elapsed since the previous sample.

**Alternative 1:** add a column with the time index values and then apply the `diff` method to that column.

**Alternative 2:** create a series from the current index:
- `df['delta_t'] = pd.Series( df.index.values[1:] - df.index.values[:-1], index = df.index[1:])`


## Exercise 20

Calculate a series with the time differences (in seconds) between consecutive samples. Take a look at the documentation of the class [`pandas.Timedelta`](https://pandas.pydata.org/docs/reference/api/pandas.Timedelta.html#pandas-timedelta).
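As a sketch under the assumption that the data frame is indexed by time, a series of timedeltas can be converted to seconds through the `.dt` accessor. The toy frame below uses made-up timestamps and elevations:

```python
import pandas as pd

# Toy frame indexed by made-up timestamps.
index = pd.to_datetime(
    ["2012-05-26 10:00:00", "2012-05-26 10:00:02", "2012-05-26 10:00:05"]
)
toy = pd.DataFrame({"ele": [10.0, 11.0, 12.5]}, index=index)

# Turn the index into a series and take consecutive differences;
# the result holds timedeltas, with NaT in the first position.
delta_t = toy.index.to_series().diff()

# .dt.total_seconds() converts each timedelta to a float (NaT becomes NaN).
seconds = delta_t.dt.total_seconds()
print(seconds)
```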

## Exercise 21

Show a graph of speed over time.

## Exercise 22

Build a DataFrame for the route section between 10:00 and 11:00.

Plot the elevation for this section of the route as a function of the total distance traveled.

## Exercise 23

Show the distance traveled between "Marker#1" and "Marker#2".