# Final Project for Applied Physics in Programming
By: Eveliina Hampus, DIN22SP

1. Import necessary libraries. 

Remember to install the libraries first, either with %pip install inside the notebook or with pip install from the command line.
Note that seaborn depends on Matplotlib.

In [121]:
import numpy as np # For calculations
import matplotlib.pyplot as plt # For basic plotting
import pandas as pd # For data manipulation
import sympy as sm # For symbolic calculations
import scipy as sc # For scientific processing
import matplotlib as mpl # To support the seaborn library
import seaborn as sns # For more informative plots, requires matplotlib to be imported first
from scipy.signal import butter,filtfilt # For filtering
from math import radians, cos, sin, asin, sqrt # For the haversine formula
from sympy.abc import x # Symbolic variable x

2. Read the data using pandas dataframes.

You can explore your data with a few methods to get a basic overview. Call them on the variable that holds your dataframe.

.head() shows the first 5 rows of the data, while .tail() does the same for the last 5 rows.
.info() prints the column data types and non-null counts.
.describe() shows basic statistics of the data.
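As a minimal sketch of these methods, the snippet below reads a tiny in-memory CSV instead of the real files. The text in csv_text is a hypothetical stand-in built from the first GPS rows, and the leading unnamed column is an assumption about why the real files show 'Unnamed: 0' columns (an index saved along with the data).

```python
import io
import pandas as pd

# Hypothetical miniature CSV; the leading unnamed column mimics a saved index
csv_text = """,seconds_elapsed,longitude,latitude
0,3.703666,25.516165,65.071375
1,5.339879,25.516227,65.071393
2,5.68,25.51624,65.071398
"""

raw = pd.read_csv(io.StringIO(csv_text))              # keeps an 'Unnamed: 0' column
df = pd.read_csv(io.StringIO(csv_text), index_col=0)  # uses that column as the index

print(df.head())      # first rows (all 3 here)
df.info()             # dtypes and non-null counts
print(df.describe())  # count, mean, std, min, quartiles, max
```

If the real files carry a saved index in the same way, passing index_col=0 to pd.read_csv would probably remove one of the 'Unnamed' columns from the overview.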

In [122]:
gps_data = pd.read_csv('GPS_data.csv') # Read the GPS data from the csv file
acc_data = pd.read_csv('Acceleration_data.csv') # Read the acceleration data from the csv file

In [123]:
gps_data.head() # Show the first 5 rows of the GPS data

Unnamed: 0.1,Unnamed: 0,seconds_elapsed,longitude,latitude
0,0,3.703666,25.516165,65.071375
1,1,5.339879,25.516227,65.071393
2,2,5.68,25.51624,65.071398
3,3,6.311351,25.516264,65.071394
4,4,7.323929,25.516267,65.071396


In [124]:
acc_data.head() # Show the first 5 rows of the acceleration data

Unnamed: 0.1,Unnamed: 0,seconds_elapsed,z,y,x
0,0,0.14344,-2.846008,-0.581977,-0.468246
1,1,0.159217,-2.651466,-0.560432,-0.418557
2,2,0.175117,-2.132024,-0.366186,-0.53599
3,3,0.190986,-1.02335,-0.041039,-0.522485
4,4,0.206763,-0.449766,0.229015,-0.35692


3. Haversine formula

In [125]:
#define haversine formula
#inputs are the coordinates of two points: lon1, lat1, lon2, lat2

def haversine(lon1, lat1, lon2, lat2):
    #convert degrees to radians
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
    #calculate differences
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    #apply haversine formula
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    c = 2 * asin(sqrt(a))
    #earth radius in kilometers
    r = 6371
    #calculate the result
    return c * r
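
The function above works on one pair of points at a time. A possible vectorized variant is sketched below; haversine_np is a hypothetical name, not part of the notebook, and it assumes the same spherical Earth radius of 6371 km.

```python
import numpy as np

def haversine_np(lon1, lat1, lon2, lat2, r=6371.0):
    """Great-circle distance in km; accepts scalars or NumPy arrays (degrees)."""
    lon1, lat1, lon2, lat2 = map(np.radians, (lon1, lat1, lon2, lat2))
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    return 2 * r * np.arcsin(np.sqrt(a))

# Sanity check: a quarter of the equator should be r * pi / 2
quarter = haversine_np(0.0, 0.0, 90.0, 0.0)
print(quarter, 6371.0 * np.pi / 2)

# Array input: distances between consecutive points of a short track
lons = np.array([25.516165, 25.516227, 25.51624])
lats = np.array([65.071375, 65.071393, 65.071398])
segments_km = haversine_np(lons[:-1], lats[:-1], lons[1:], lats[1:])
print(segments_km * 1000)  # metres
```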

4. Calculate velocity from GPS data.

In [126]:
lat = gps_data['latitude'] #latitude
lon = gps_data['longitude'] #longitude

gps_data['dist'] = np.zeros(lat.shape[0]) #create a new column for distance
gps_data['time_diff'] = np.zeros(lat.shape[0]) #create a new column for time difference

#calculate distance and time difference between points
#use .loc to assign in a single step and avoid pandas chained-assignment warnings
for i in range(lat.shape[0]-1):
    gps_data.loc[i, 'dist'] = haversine(lon[i], lat[i], lon[i+1], lat[i+1])*1000 #distance in meters
    gps_data.loc[i, 'time_diff'] = gps_data.loc[i+1, 'seconds_elapsed'] - gps_data.loc[i, 'seconds_elapsed']

gps_data.loc[0, 'dist'] = 0 #set the first distance to zero
gps_data['velocity'] = gps_data['dist']/gps_data['time_diff'] #calculate velocity; the last row is 0/0 = NaN

#Print velocity
gps_data['velocity']

0      0.000000
1      2.382686
2      1.855274
3      0.226845
4      0.717928
         ...   
217    1.745572
218    1.882024
219    0.925273
220    1.851229
221         NaN
Name: velocity, Length: 222, dtype: float64
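
The same per-segment calculation can also be written without a loop. The sketch below is one possible approach, not the notebook's method: haversine_m and the small gps frame (the first five rows shown earlier) are illustrative stand-ins. Unlike the cell above, it does not set the first distance to zero.

```python
import numpy as np
import pandas as pd

def haversine_m(lon1, lat1, lon2, lat2, r=6371000.0):
    """Great-circle distance in metres (hypothetical vectorized helper)."""
    lon1, lat1, lon2, lat2 = map(np.radians, (lon1, lat1, lon2, lat2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * r * np.arcsin(np.sqrt(a))

# Stand-in for gps_data: the first five rows printed by gps_data.head()
gps = pd.DataFrame({
    'seconds_elapsed': [3.703666, 5.339879, 5.68, 6.311351, 7.323929],
    'longitude': [25.516165, 25.516227, 25.51624, 25.516264, 25.516267],
    'latitude': [65.071375, 65.071393, 65.071398, 65.071394, 65.071396],
})

nxt = gps.shift(-1)  # row i of nxt holds the values of row i+1 of gps
gps['dist'] = haversine_m(gps['longitude'], gps['latitude'],
                          nxt['longitude'], nxt['latitude'])
gps['time_diff'] = nxt['seconds_elapsed'] - gps['seconds_elapsed']
gps['velocity'] = gps['dist'] / gps['time_diff']  # last row has no successor: NaN

print(gps[['dist', 'time_diff', 'velocity']])
```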

Calculate total distance.

In [127]:
# calculate the total distance traveled
total_distance = gps_data['dist'].sum()
print('Total distance traveled: ', total_distance, 'm')

Total distance traveled:  215.14092455483325 m


Calculate total time.

In [128]:
# calculate the total time elapsed
total_time = gps_data['seconds_elapsed'].iloc[-1] - gps_data['seconds_elapsed'].iloc[0]
print('Total time elapsed: ', total_time, 's')

Total time elapsed:  129.99008081054686 s


Calculate average speed.

In [129]:
# calculate average speed
average_speed = total_distance / total_time
print('Average speed: ', average_speed, 'm/s')

Average speed:  1.6550564721041208 m/s


Introducing filters for noisy data.

In [130]:
def butter_lowpass_filter(data, cutoff, fs, nyq, order):
    normal_cutoff = cutoff / nyq
    b, a = butter(order, normal_cutoff, btype='low', analog=False)
    y = filtfilt(b, a, data)
    return y

def butter_highpass_filter(data, cutoff, fs, nyq, order):
    normal_cutoff = cutoff / nyq
    b, a = butter(order, normal_cutoff, btype='high', analog=False)
    y = filtfilt(b, a, data)
    return y

#Filter the data
noisy_signal = acc_data['z'] #z-axis acceleration
time = acc_data['seconds_elapsed']

order = 2 #filter order
dt = (time[len(time)-1] - time[0])/len(time) #average sampling interval in seconds

fs = 1/dt #sampling frequency in Hz
nyq = 0.5 * fs #Nyquist frequency in Hz
cutoff_H = 1/5 #Highpass cutoff in Hz
cutoff_L = 1/0.5 #Lowpass cutoff in Hz

#Apply the filters defined above: lowpass first, then highpass
lowpass_filtered = butter_lowpass_filter(noisy_signal, cutoff_L, fs, nyq, order)
final_signal = butter_highpass_filter(lowpass_filtered, cutoff_H, fs, nyq, order)
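
To illustrate what the lowpass and highpass combination does, here is a small sketch on a synthetic signal. The 100 Hz sampling rate and the three sine components are assumptions chosen for the demonstration, not values from the measured data; the cutoffs (2 Hz and 0.2 Hz) and the order match the cell above.

```python
import numpy as np
from scipy.signal import butter, filtfilt

fs = 100.0                    # assumed sampling frequency in Hz
nyq = 0.5 * fs                # Nyquist frequency
t = np.arange(0, 20, 1 / fs)  # 20 seconds of samples

clean = np.sin(2 * np.pi * 1.0 * t)         # 1 Hz "walking" component
drift = 2.0 * np.sin(2 * np.pi * 0.05 * t)  # slow drift, below 0.2 Hz
noise = 0.5 * np.sin(2 * np.pi * 20.0 * t)  # fast disturbance, above 2 Hz
noisy = clean + drift + noise

# Lowpass at 2 Hz removes the fast disturbance
b, a = butter(2, 2.0 / nyq, btype='low', analog=False)
low = filtfilt(b, a, noisy)

# Highpass at 0.2 Hz removes the slow drift
b, a = butter(2, 0.2 / nyq, btype='high', analog=False)
filtered = filtfilt(b, a, low)

# Compare with the clean component away from the edges
mid = slice(len(t) // 4, 3 * len(t) // 4)
corr = np.corrcoef(filtered[mid], clean[mid])[0, 1]
print('Correlation with the 1 Hz component:', corr)
```

Because filtfilt runs the filter forwards and backwards, the result has no phase shift, which keeps the zero crossings in place.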


Calculate steps taken.

In [131]:
stepdata = final_signal
steps = 0
#A sign change between consecutive samples is a zero crossing; each crossing counts as half a step
for i in range(stepdata.shape[0]-1):
    if stepdata[i]/stepdata[i+1] < 0:
        steps = steps + 0.5

print('The number of steps is', steps)

The number of steps is 255.5
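
The zero-crossing count can also be written without a loop. The sketch below is an alternative formulation, not the notebook's code: count_steps is a hypothetical helper, and it uses a product instead of a division so that a sample equal to zero cannot cause a division by zero. The demo signal is synthetic.

```python
import numpy as np

def count_steps(signal):
    """Count half a step per sign change between consecutive samples."""
    signal = np.asarray(signal, dtype=float)
    crossings = np.count_nonzero(signal[:-1] * signal[1:] < 0)
    return crossings / 2

# Synthetic check: 5 seconds of a 1 Hz oscillation contain 10 zero crossings
t = np.linspace(0, 5, 501)
demo = np.sin(2 * np.pi * 1.0 * t + 0.3)
print('The number of steps is', count_steps(demo))
```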
