# 08 Demo: Python Time
UW Geospatial Data Analysis  
CEE498/CEWA599  
David Shean  

## Introduction
* https://csit.kutztown.edu/~schwesin/fall20/csc223/lectures/Pandas_Time_Series.html
* Multiple options to represent datetime objects - easy to convert
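
As a rough illustration of how the three representations convert into one another, here is a minimal sketch. The variable names are made up for this example, and the round trip assumes a timezone-naive value.

```python
from datetime import datetime
import numpy as np
import pandas as pd

# Start from a built-in Python datetime (timezone-naive)
py_dt = datetime(2022, 2, 22, 12, 30)

# Python datetime -> pandas Timestamp -> NumPy datetime64
ts = pd.Timestamp(py_dt)
np_dt = ts.to_datetime64()

# NumPy datetime64 -> pandas Timestamp -> Python datetime
ts_back = pd.Timestamp(np_dt)
py_back = ts_back.to_pydatetime()

print(type(ts), type(np_dt), type(py_back))
```

Going through `pd.Timestamp` in both directions is one convenient route, not the only one.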

### Python `datetime`
* The built-in `datetime` module contains a `datetime` class (and a `timedelta` class); the shared name can be confusing (`datetime.datetime` vs. `from datetime import datetime`)
* https://docs.python.org/3/library/datetime.html

### NumPy `datetime64`
* https://numpy.org/doc/stable/reference/arrays.datetime.html
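
A short sketch of basic `datetime64` and `timedelta64` arithmetic may help here. The dates are arbitrary and the variable names are only for illustration.

```python
import numpy as np

# Scalars parsed from ISO 8601 strings
t1 = np.datetime64('2019-02-01T12:00')
t2 = np.datetime64('2019-02-06')

# Subtracting two datetime64 values gives a timedelta64
delta = t2 - t1

# Dividing by a unit timedelta64 gives a plain float in that unit
delta_days = delta / np.timedelta64(1, 'D')
print(delta_days)

# Regularly spaced daily array (end date is exclusive)
days = np.arange('2019-02-01', '2019-02-04', dtype='datetime64[D]')
print(days)
```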

### Pandas `Timestamp`
* https://pandas.pydata.org/docs/user_guide/timeseries.html
* https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Timestamp.html
* https://pandas.pydata.org/docs/user_guide/timeseries.html#overview
* `DatetimeIndex`
* `pd.to_datetime()`
    * Accepts "int, float, str, datetime, list, tuple, 1-d array, Series, DataFrame/dict-like"
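
The range of inputs accepted by `pd.to_datetime()` can be sketched as follows. The values are arbitrary examples, and the epoch value assumes the input is in seconds since 1970-01-01 UTC.

```python
import pandas as pd

# A single string returns a Timestamp
single = pd.to_datetime('2019-02-01 12:00:00')

# A list of strings returns a DatetimeIndex
idx = pd.to_datetime(['2019-02-01', '2019-02-06'])

# A DataFrame with year/month/day columns returns a Series of datetimes
df = pd.DataFrame({'year': [2019, 2019], 'month': [2, 2], 'day': [1, 6]})
from_cols = pd.to_datetime(df)

# Numeric input is interpreted relative to the Unix epoch; unit='s' means seconds
from_epoch = pd.to_datetime(1549022400, unit='s')

print(single, idx, from_cols, from_epoch, sep='\n')
```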

### xarray
* https://xarray.pydata.org/en/stable/user-guide/time-series.html

### Day of calendar year
* January 1 = 1
* January 2 = 2
* December 31 = 365 (366 in a leap year)
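
Pandas exposes the day of year directly through the `dayofyear` attribute. A small sketch with arbitrary dates, including a leap year:

```python
import pandas as pd

# Single Timestamp: February 1 is day 32
doy_feb1 = pd.Timestamp('2019-02-01').dayofyear

# December 31 in a common year and in a leap year
doy_common = pd.Timestamp('2019-12-31').dayofyear
doy_leap = pd.Timestamp('2020-12-31').dayofyear

# The same attribute works element-wise on a DatetimeIndex
idx = pd.date_range('2020-01-01', periods=3, freq='D')
doy_idx = idx.dayofyear

print(doy_feb1, doy_common, doy_leap, list(doy_idx))
```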

### Water year
* Starts October 1, ends September 30 (in the USGS convention, labeled by the calendar year in which it ends)
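
One possible way to assign a water year to each timestamp is sketched below. It assumes the convention that a water year is labeled by the calendar year in which it ends (so October 1, 2021 falls in water year 2022). `water_year` is a hypothetical helper written for this example, not a Pandas function.

```python
import numpy as np
import pandas as pd

def water_year(ts):
    """Hypothetical helper: water year for a single timestamp.

    Assumes the water year starts October 1 and is labeled by the
    calendar year in which it ends.
    """
    ts = pd.Timestamp(ts)
    return ts.year + 1 if ts.month >= 10 else ts.year

# Vectorized version for a DatetimeIndex spanning the water-year boundary
idx = pd.date_range('2021-09-29', '2021-10-02', freq='D')
wy = np.where(idx.month >= 10, idx.year + 1, idx.year)

print(water_year('2021-09-30'), water_year('2021-10-01'))
print(list(wy))
```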

### Time zones
* Let Pandas handle this
* You will inevitably get a warning or error about timezone-aware vs. timezone-naive Timestamp objects
    * Add time zone: https://pandas.pydata.org/docs/reference/api/pandas.Timestamp.tz_localize.html
    * Remove time zone: https://stackoverflow.com/a/34687479
* General advice (time and timestamps are messy): https://www.youtube.com/watch?v=-5wpm-gesOY&ab_channel=Computerphile
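
A minimal sketch of adding, converting, and removing a time zone on a `Timestamp`. The zone name `America/Los_Angeles` is just an example choice.

```python
import pandas as pd

# Timezone-naive timestamp: no zone information attached
naive = pd.Timestamp('2019-02-01 12:00:00')

# Add a time zone: interpret the wall-clock time as local Pacific time
aware = naive.tz_localize('America/Los_Angeles')

# Convert the same instant to UTC
utc = aware.tz_convert('UTC')

# Remove the time zone again, keeping the UTC wall-clock time
naive_utc = utc.tz_localize(None)

# Ordering comparisons between naive and aware timestamps are not allowed
try:
    naive < aware
    mixed_ok = True
except TypeError:
    mixed_ok = False

print(aware, utc, naive_utc, mixed_ok, sep='\n')
```

Converting everything to UTC early and keeping it timezone-aware is one common way to avoid mixing the two kinds.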

## Discussion
* (t,x,y,z) records for one or more variables
* Pandas `Timestamp` vs. Python `datetime` vs. NumPy `datetime64`
    * Some functions across different modules play nicely with one type but not the others
* Dealing with missing values in DataFrame
    * Sometimes a sensor or datalogger fails; sometimes values are flagged as erroneous
    * Pandas has excellent support for missing values: https://pandas.pydata.org/pandas-docs/stable/user_guide/missing_data.html
    * `dropna()` https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.dropna.html#pandas.DataFrame.dropna
* Trajectories
    * Argo floats (https://argo.ucsd.edu/)
    * Weather balloons
    * GNSS tracks - vehicles, pedestrians, aircraft
        * Spatial and temporal derivatives
* Permanent stations
    * Stream gage
    * SNOTEL sites
* How big is too big for Pandas/GeoPandas?
    * https://github.com/toddwschneider/nyc-taxi-data
    * PostgreSQL/PostGIS
        * SQL - Structured Query Language, used for managing data in a relational database
* What to do with multiple variables for each timestamp?
    * xarray works well for multiple variables (e.g., snow depth and SWE for same site) for each station for each time
        * https://docs.xarray.dev/en/stable/
    * Separate 2D dataframes
        * One storing locations of all sites
        * One storing time series of some variable for all sites
        * Common station ID as key
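
The two-table layout and the missing-value handling discussed above can be sketched together. All station IDs, coordinates, and snow depth values below are made up for illustration, and the column names are hypothetical.

```python
import numpy as np
import pandas as pd

# Table 1: one row per site (hypothetical IDs and coordinates)
sites = pd.DataFrame({
    'station_id': ['A', 'B'],
    'lon': [-121.5, -120.9],
    'lat': [47.4, 48.1],
})

# Table 2: time series for all sites; NaN marks a failed or flagged reading
obs = pd.DataFrame({
    'station_id': ['A', 'A', 'B', 'B'],
    'time': pd.to_datetime(['2022-02-01', '2022-02-02',
                            '2022-02-01', '2022-02-02']),
    'snow_depth_m': [1.2, np.nan, 0.8, 0.9],
})

# Drop rows where the measurement is missing
valid = obs.dropna(subset=['snow_depth_m'])

# Attach site coordinates to each observation using the shared key
merged = valid.merge(sites, on='station_id', how='left')
print(merged)
```

Keeping coordinates in a separate table avoids repeating them for every time step; the join is only done when both are needed.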

In [None]:
from datetime import datetime
import pandas as pd
import numpy as np

In [None]:
dt1 = datetime(2022, 2, 22)
dt2 = datetime.now()

In [None]:
print(dt2)

In [None]:
dt2.year

In [None]:
dt2.strftime('%Y%m%d')

In [None]:
dt2 - dt1

In [None]:
pd.to_datetime(dt2)

In [None]:
ts1 = pd.Timestamp('2019-02-01 12:00:00')

In [None]:
ts2 = pd.Timestamp('2019-02-06 00:00:00')

In [None]:
ts1

In [None]:
ts2

In [None]:
dt = ts2 - ts1

In [None]:
dt

In [None]:
ts1 + dt

In [None]:
ts2 + dt

In [None]:
ts1 - pd.Timedelta(days=1)

In [None]:
dt.total_seconds()