In this notebook, we're going to talk about the various methods that people use to keep track of ***timestamps***. A timestamp refers to the time at which a measurement was taken during a series of events.


# Important: Run this code cell each time you start a new session!

In [None]:
!pip install numpy
!pip install pandas
!pip install matplotlib
!pip install ipywidgets
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import os

In [None]:
!wget -Ncnp https://physionet.org/files/accelerometry-walk-climb-drive/1.0.0/raw_accelerometry_data/id00b70b13.csv

In [None]:
df = pd.read_csv('id00b70b13.csv')

# Filter to only walking activity, which is given a code of 1
df = df[df['activity'] == 1]

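# Process the time: keep a 10-second window and shift it to start at 0
df = df.rename(columns={'time_s': 'Time'})
df = df[(df['Time'] >= 700) & (df['Time'] <= 710)].copy()  # copy avoids SettingWithCopyWarning
df['Time'] = df['Time'] - df['Time'].min()

# Compute the magnitude of the three la_* axes and scale by 9.8 to get m/s^2
df['Accel'] = np.sqrt(df['la_x']**2 + df['la_y']**2 + df['la_z']**2)*9.8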

# Keep only crucial columns
keep_cols = ['Time', 'Accel']
df = df[keep_cols]
df.to_csv('walking.csv',index=False)

# Option 1: `ints` and `floats`

The simplest way of including a timestamp is by using integers or floats to represent a unit of time like milliseconds, minutes, or hours. In some cases, the sequence starts from 0 and increments until measurements stop being recorded. The `Time` column of the following time series does just that, showing how much time has passed since the start of the recording in seconds:

In [None]:
df = pd.read_csv('walking.csv')
df.head(10)

In [None]:
plt.figure(figsize=(5,3))
plt.plot(df['Time'], df['Accel'])
plt.xlabel('Time (s)')
plt.ylabel('Acceleration (m/s^2)')
plt.show()

There may be situations where your times do not start at 0, but rather another value (e.g., the current [Unix time](https://www.epochconverter.com/)). If this is the case, you can easily adjust the start of your timestamps by subtracting the lowest timestamp value from all of the entries:

In [None]:
df['Time'] = df['Time'] - df['Time'].min()
df.head(10)
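As a rough illustration of that adjustment on Unix time, here is a minimal sketch. The `unix_df` table and its values are made up for this example and are not part of the walking dataset.

```python
import pandas as pd

# Hypothetical recording: Unix timestamps in seconds, sampled every 0.5 s
unix_df = pd.DataFrame({
    'Time': [1662033600.0, 1662033600.5, 1662033601.0, 1662033601.5],
    'Accel': [9.8, 10.1, 9.6, 9.9],
})

# Subtract the earliest timestamp so the recording starts at 0 s
unix_df['Elapsed'] = unix_df['Time'] - unix_df['Time'].min()
print(unix_df)
```

Storing the result in a new column (here called `Elapsed`) keeps the original timestamps around in case you need the real-world time later.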

# Option 2: `datetime`

For situations when it is important to keep track of the exact real-world date and time when a measurement was taken, one option is to use the `datetime` object from Python's standard library.

In [None]:
# Get the current time
from datetime import datetime
datetime.now()

In [None]:
# Create a timestamp for a specific date and time
datetime(2022, 9, 1, 12, 0, 0)  # September 1, 2022, 12:00 PM

This data structure allows you to easily extract the specific characteristics of a given date, such as the year, minute, or even microsecond:



In [None]:
current_time = datetime.now()
print(f"Current year: {current_time.year}")
print(f"Current minute: {current_time.minute}")
print(f"Current microsecond: {current_time.microsecond}")
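A `datetime` can also be converted to and from the Unix time mentioned in Option 1. The sketch below assumes a made-up Unix timestamp and works in UTC so that the result does not depend on your computer's time zone. It also shows that subtracting two `datetime` objects gives a `timedelta`.

```python
from datetime import datetime, timedelta, timezone

# Convert a Unix timestamp (seconds since 1970-01-01 UTC) into a datetime
start = datetime.fromtimestamp(1662033600, tz=timezone.utc)
print(start)

# Adding a timedelta moves the datetime forward in time
end = start + timedelta(seconds=90)

# Subtracting two datetimes yields a timedelta
elapsed = end - start
print(elapsed.total_seconds())

# Convert the datetime back into a Unix timestamp
print(start.timestamp())
```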

If the horizontal axis involves `datetime` objects, the `plot()` function in `matplotlib` is good at automatically recognizing them as dates and displaying them as such.

In [None]:
# Manipulate the timestamps so they start at the current time
# No need to worry about the syntax
from datetime import timedelta
df = pd.read_csv('walking.csv')
current_time = datetime.now()
df['Date'] = df['Time'].apply(lambda t: current_time + timedelta(seconds=t))
df.head(10)

In [None]:
plt.figure(figsize=(5,3))
plt.plot(df['Date'], df['Accel'], 'k-')
plt.xlabel('Date')
plt.ylabel('Acceleration (m/s^2)')
plt.xticks(rotation=45)
plt.show()

# Option 3: `Timestamp` in `pandas`

`pandas` provides powerful functionality to automatically infer dates from strings. Dates are technically converted into a bespoke `pandas` data type called `Timestamp`, which is functionally similar to the `datetime` object in Python.

In the example below, we create a `DataFrame` using our own formatted string. By using the `to_datetime()` function, `pandas` automatically infers which parts of the string correspond to the year, month, etc.

In [None]:
# Create a DataFrame with dates stored as strings, then check the data type of each column
frame_dict = {'Date': ['Jan-01-2023 12:00', 'Jan-02-2023 1:00', 'Jan-03-2023 2:00', 'Jan-04-2023 3:00'],
              'Value': [45, 34, 23, 12]}
df = pd.DataFrame(frame_dict)
df.info()

In [None]:
# Convert the data type of the Date column to a datetime
df['Date'] = pd.to_datetime(df['Date'])
df.info()
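If automatic inference guesses wrong or you want to be explicit, `to_datetime()` also accepts a `format` argument. The sketch below uses made-up strings in the same style as above and assumes English month abbreviations. The `.dt` accessor then exposes the same kind of attributes as a `datetime` object.

```python
from datetime import datetime
import pandas as pd

# Hypothetical date strings: abbreviated month, day, year, then 24-hour time
dates = pd.Series(['Jan-01-2023 12:00', 'Jan-02-2023 13:30'])

# Spell out the layout of the string instead of relying on inference
parsed = pd.to_datetime(dates, format='%b-%d-%Y %H:%M')

# The .dt accessor extracts date components for the whole column at once
print(parsed.dt.day.tolist())
print(parsed.dt.hour.tolist())

# Each element is a pandas Timestamp, which also counts as a datetime
print(isinstance(parsed.iloc[0], datetime))
```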

In [None]:
plt.figure(figsize=(5,3))
plt.plot(df['Date'], df['Value'], 'ko')
plt.xlabel('Date')
plt.ylabel('Value')
plt.xticks(rotation=45)
plt.show()

When you load data from a .csv file, you can ask `pandas` to parse certain columns as dates by listing them in the `parse_dates` argument. Older versions of `pandas` also offered an `infer_datetime_format` argument, but it is deprecated as of `pandas` 2.0 because the format is now inferred by default.

In [None]:
df = pd.read_csv('walking.csv', parse_dates=['Time'])
df.info()

In this case, using this functionality isn't going to do anything because a single number is underspecified as a human-interpretable date: it says nothing about the unit or the starting point. However, this technique will work for most columns that contain readily identifiable dates.
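If you do know the unit and the starting point, you can supply them yourself. The sketch below assumes the numbers are seconds elapsed since a recording start time that we invent for the example. Both `elapsed_df` and `start` are hypothetical.

```python
import pandas as pd

# Hypothetical elapsed times in seconds, similar to the Time column above
elapsed_df = pd.DataFrame({'Time': [0.0, 0.5, 1.0, 1.5]})

# Assumed start of the recording; substitute the real start time if you know it
start = pd.Timestamp('2022-09-01 12:00:00')

# Interpret each number as an offset in seconds and add it to the start time
elapsed_df['Date'] = pd.to_timedelta(elapsed_df['Time'], unit='s') + start
print(elapsed_df)
```

Stating the unit and origin explicitly removes the ambiguity that prevents `parse_dates` from doing anything useful with a purely numeric column.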