<font color="white">.</font> | <font color="white">.</font> | <font color="white">.</font>
-- | -- | --
![NASA](http://www.nasa.gov/sites/all/themes/custom/nasatwo/images/nasa-logo.svg) | <h1><font size="+3">ASTG Python Courses</font></h1> | ![NASA](https://www.nccs.nasa.gov/sites/default/files/NCCS_Logo_0.png)

---

<CENTER>
<H1 style="color:red">
datetime Module
</H1>
</CENTER>

In [None]:
from __future__ import print_function

## <font color='red'>What will be Covered?</font>

<OL>
<LI> Times
<LI> Dates
<LI> timedeltas
<LI> Date Arithmetic
<LI> Comparing Values
<LI> Combining Dates and Times
<LI> Formatting and Parsing
<LI> Time Series with Pandas
</OL>

## <font color='red'>Reference Documents</font>

- <A HREF="http://pleac.sourceforge.net/pleac_python/datesandtimes.html">Dates and Times</A>
- <A HREF="http://www.marinamele.com/2014/03/13-useful-tips-about-python-datetime.html">Useful Tips about Python datetime Objects</A>
- <A HREF="https://pymotw.com/3/datetime/">datetime - Date and Time Value Manipulation</A>
- <a href="http://earthpy.org/pandas-basics.html">Time series analysis with pandas</a>
- <a href="https://jakevdp.github.io/PythonDataScienceHandbook/03.11-working-with-time-series.html">Working with Time Series</a>

## <font color='red'>What is the `datetime` Module?</font>

- The `datetime` module supplies classes to work with date and time. 
- These classes provide a number of functions to deal with dates, times and time intervals.  
- Dates and datetimes are objects in Python, so when you manipulate them, you are manipulating objects and not strings or timestamps.

The `datetime` module provides high-level interface classes:

- `date`: An idealized date that assumes the Gregorian calendar extends infinitely into the future and past. It stores the year, month, and day as attributes.
- `time`: An idealized time that assumes there are 86,400 seconds per day with no leap seconds. This object stores the hour, minute, second, microsecond, and tzinfo (time zone information).
- `datetime`: A combination of a date and a time. It has all the attributes of both classes.
- `timedelta`: A duration expressing the difference between two date, time, or datetime objects to microsecond resolution.
- `tzinfo`: Provides time zone information objects.
- `timezone`: A class that implements the `tzinfo` abstract base class as a fixed offset from the UTC.

In this presentation, we will focus on the first four classes.

In [None]:
import datetime
import numpy as np
import pandas as pd

## <font color='red'>Times</font>

Time values are represented with the <B>time</B> class. Times have attributes for hour, minute, second, and microsecond. 

```python
datetime.time(hour=0, minute=0, second=0, microsecond=0, tzinfo=None, *, fold=0)
```

In [None]:
t = datetime.time(1, 2, 3)
print (t)
print ('hour  :', t.hour)
print ('minute:', t.minute)
print ('second:', t.second)
print ('microsecond:', t.microsecond)
print ('tzinfo:', t.tzinfo)

The variable <B>t</B> only holds values of time, and not a date associated with the time.
<P>
You can get the valid range of times in a single day:

In [None]:
print ('Earliest  :', datetime.time.min)
print ('Latest    :', datetime.time.max)
print ('Resolution:', datetime.time.resolution)

Note that the resolution for time is limited to whole microseconds.
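The constructor also validates every field, so an out-of-range value raises `ValueError` instead of wrapping around. The sketch below illustrates this; the helper name `is_valid_time` is hypothetical and exists only for this example.

```python
import datetime

def is_valid_time(hour, minute=0, second=0, microsecond=0):
    """Return True if the fields form a valid datetime.time (hypothetical helper)."""
    try:
        datetime.time(hour, minute, second, microsecond)
        return True
    except ValueError:
        return False

print(is_valid_time(23, 59, 59, 999999))  # same fields as datetime.time.max
print(is_valid_time(24, 0, 0))            # hour must be in the range 0..23
```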

## <font color='red'>Dates</font>

Calendar date values are represented with the **date** class.

It is easy to create a date representing today’s date using the **today()** class method.
    
```python
datetime.date(year, month, day)
```

In [None]:
today = datetime.date.today()
print ('today:  ', today)
print ('ctime:  ', today.ctime())
print ('tuple:  ', today.timetuple())
print ('ordinal:', today.toordinal())
print ('Year:   ', today.year)
print ('Mon:    ', today.month)
print ('Day :   ', today.day)

There are also class methods for creating instances from integers (proleptic Gregorian ordinal values, which start counting from January 1 of year 1) or from POSIX timestamp values.

The following example illustrates the different value types used by:

- `fromordinal()`: Return the date corresponding to the proleptic Gregorian ordinal, where January 1 of year 1 has ordinal 1.
- `fromtimestamp()`: Return the local date corresponding to the timestamp.

In [None]:
import time

o = 733114
print ('o = {} and fromordinal(o) = {}'.format(o, datetime.date.fromordinal(o)))

t = time.time()
print ('t = {} and fromtimestamp(t) = {}'.format(t, datetime.date.fromtimestamp(t)))
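As a small check on the ordinal conversion, `toordinal()` is the inverse of `fromordinal()`. The sketch below reuses the ordinal value from the cell above. Timestamps are left out because their result depends on the local time zone.

```python
import datetime

d = datetime.date.fromordinal(733114)
print(d)               # 2008-03-13
print(d.toordinal())   # 733114

# Ordinal 1 is January 1 of year 1, the earliest representable date
print(datetime.date.fromordinal(1) == datetime.date.min)
```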

We can also determine the range of date values:

In [None]:
print ('Earliest  :', datetime.date.min)
print ('Latest    :', datetime.date.max)
print ('Resolution:', datetime.date.resolution)

Note too that the resolution for dates is a whole day.
<P>
Another way to create new date instances uses the <B>replace()</B> method of an existing date. For example, you can change the year, leaving the day and month alone.

In [None]:
d1 = datetime.date(2008, 3, 12)
print ('d1:', d1)

d2 = d1.replace(year=2009)
print ('d2:', d2)
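One caveat with `replace()`: the result must still be a valid date. The following sketch (the dates are arbitrary) shows that moving February 29 into a non-leap year raises `ValueError`, and that several fields can be replaced in a single call to avoid this.

```python
import datetime

leap_day = datetime.date(2008, 2, 29)

try:
    leap_day.replace(year=2009)   # 2009 has no February 29
except ValueError as err:
    print('replace failed:', err)

# Replacing year and day together yields a valid date
fixed = leap_day.replace(year=2009, day=28)
print('fixed:', fixed)
```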

## <font color='red'>timedeltas</font>

We can use <B>datetime</B> to perform basic arithmetic on date values via the <B>timedelta</B> class.

```python
datetime.timedelta(days=0, seconds=0, microseconds=0, 
                   milliseconds=0, minutes=0, hours=0, weeks=0)
```

In [None]:
print ("microseconds:", datetime.timedelta(microseconds=1))
print ("milliseconds:", datetime.timedelta(milliseconds=1))
print ("seconds     :", datetime.timedelta(seconds=1))
print ("minutes     :", datetime.timedelta(minutes=1))
print ("hours       :", datetime.timedelta(hours=1))
print ("days        :", datetime.timedelta(days=1))
print ("weeks       :", datetime.timedelta(weeks=1))
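Internally, a `timedelta` keeps only days, seconds, and microseconds; the other constructor arguments are converted into these three. The short sketch below (values chosen arbitrarily) shows the normalized attributes and `total_seconds()`.

```python
import datetime

delta = datetime.timedelta(hours=25, minutes=30)
print(delta)                  # 1 day, 1:30:00
print(delta.days)             # 1
print(delta.seconds)          # 5400
print(delta.total_seconds())  # 91800.0

# Negative durations are normalized so that only the days field is negative
negative = datetime.timedelta(hours=-1)
print(negative)               # -1 day, 23:00:00
```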

## <font color='red'>Date Arithmetic</font>

Date math uses the standard arithmetic operators. The following example with date objects illustrates using <B>timedelta</B> objects to compute new dates, and subtracting date instances to produce timedeltas (including a negative delta value).

In [None]:
today = datetime.date.today()
print ('Today    :', today)

one_day = datetime.timedelta(days=1)
print ('One day  :', one_day)

yesterday = today - one_day
print ('Yesterday:', yesterday)

tomorrow = today + one_day
print ('Tomorrow :', tomorrow)

print ('tomorrow - yesterday:', tomorrow - yesterday)
print ('yesterday - tomorrow:', yesterday - tomorrow)
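The same operators work with fixed dates, which makes the results reproducible. The sketch below uses arbitrary dates to count the days between two dates and to scale a `timedelta` by an integer.

```python
import datetime

start = datetime.date(2020, 2, 1)
end = datetime.date(2020, 3, 1)

elapsed = end - start
print(elapsed.days)   # 29, because 2020 is a leap year

# A timedelta can be multiplied by an integer: ten weeks after start
later = start + 10 * datetime.timedelta(weeks=1)
print(later)          # 2020-04-11
```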

## <font color='red'>Comparing Values</font>

Both date and time values can be compared using the standard operators to determine which is earlier or later.

In [None]:
print ('Times:')
t1 = datetime.time(12, 55, 0)
print ('\tt1:', t1)
t2 = datetime.time(13, 5, 0)
print ('\tt2:', t2)
print ('\tt1 < t2:', t1 < t2)

print ('Dates:')
d1 = datetime.date.today()
print ('\td1:', d1)
d2 = datetime.date.today() + datetime.timedelta(days=1)
print ('\td2:', d2)
print ('\td1 > d2:', d1 > d2)
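Because dates support ordering, built-ins such as `sorted()`, `min()`, and `max()` work on them directly. A minimal sketch with an arbitrary list of dates:

```python
import datetime

dates = [
    datetime.date(2019, 7, 20),
    datetime.date(1969, 7, 16),
    datetime.date(1972, 12, 7),
]

ordered = sorted(dates)       # earliest first
print(ordered)
print('earliest:', min(dates))
print('latest  :', max(dates))
```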

## <font color='red'>Combining Dates and Times</font>

We can use the datetime class to hold values consisting of both date and time components. 

```python
datetime.datetime(year, month, day, 
                  hour=0, minute=0, second=0, microsecond=0, tzinfo=None, *, fold=0)
```

In [None]:
print('Now    :', datetime.datetime.now())
print('Today  :', datetime.datetime.today())
print('UTC Now:', datetime.datetime.utcnow())

d = datetime.datetime.now()
for attr in [ 'year', 'month', 'day', 'hour', 'minute', 'second', 'microsecond']:
    print (attr, ':', getattr(d, attr))

In [None]:
NOW = datetime.datetime.now()
 
print( "Current date & time =         %s " %NOW)
print( "Date and time in ISO format = %s" %NOW.isoformat())
print( "Current year =                %s " %NOW.year)
print( "Current month =               %s " %NOW.month)
print( "Current date (day) =          %s " %NOW.day)
print( "dd/mm/yyyy format =           %s/%s/%s" % (NOW.day, NOW.month, NOW.year))
print( "Current hour =                %s " %NOW.hour)
print( "Current minute =              %s " %NOW.minute)
print( "Current second =              %s" %NOW.second)
print( "hh:mm:ss format =             %s:%s:%s" % (NOW.hour, NOW.minute, NOW.second))

* Just as with date, datetime provides convenient class methods for creating new instances. It also includes `fromordinal()` and `fromtimestamp()`. 
* In addition, `combine()` can be useful if you already have a date instance and time instance and want to create a datetime.

In [None]:
t = datetime.time(1, 2, 3)
print ('t :', t)

d = datetime.date.today()
print ('d :', d)

dt = datetime.datetime.combine(d, t)
print ('dt:', dt)
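The reverse operation is also available: the `date()` and `time()` methods split a datetime back into its two parts. A brief sketch with an arbitrary timestamp:

```python
import datetime

stamp = datetime.datetime(2018, 7, 4, 13, 45, 30)

date_part = stamp.date()   # datetime.date(2018, 7, 4)
time_part = stamp.time()   # datetime.time(13, 45, 30)
print(date_part, time_part)

# Combining the two parts again gives back the original value
print(datetime.datetime.combine(date_part, time_part) == stamp)
```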

## <font color='red'>Formatting and Parsing</font>

* The default string representation of a datetime object uses the ISO 8601 layout with a space between the date and the time: `YYYY-MM-DD HH:MM:SS.mmmmmm`. The `isoformat()` method uses `T` as the separator by default.
* Alternate formats can be generated with the `strftime()` method.
    - It takes a format string made of control codes (directives).
    - Each control code stands for a different field, such as the year, month, weekday, or day of the month.
* If your input data includes timestamp values parsable with `time.strptime()`, then `datetime.datetime.strptime()` is a convenient way to convert them to datetime instances.

**Useful `strptime` and `strftime` Patterns**

|Directive | Meaning |
| --- | --- |
| `%a` | Weekday as locale's abbreviated name |
| `%A` | Weekday as locale's full name |
| `%w` | Weekday as decimal number, where 0 is Sunday and 6 is Saturday |
| `%d` | Day of the month as a zero-padded decimal number |
| `%b` | Month as locale's abbreviated name |
| `%B` | Month as locale's full name |
| `%m` | Month as zero-padded decimal number |
| `%y` | Year without century as a zero-padded decimal number |
| `%Y` | Year with century as a decimal number |
| `%H` | Hour (24-hour clock) as a zero-padded decimal number |
| `%I` | Hour (12-hour clock) as a zero-padded decimal number |
| `%p` | Locale equivalent of either AM or PM |
| `%M` | Minute as a zero-padded decimal number |
| `%S` | Second as a zero-padded decimal number |
| `%f` | Microsecond as a zero-padded decimal number |
| `%j` | Day of the year as a zero-padded decimal number |
| `%W` | Week number of the year (Monday as the first day of the week) as a decimal number |
| `%U` | Week number of the year (Sunday as the first day of the week) as a decimal number |
| `%c` | Locale’s appropriate date and time representation |
| `%Z` | Time zone name |
| `%z` | UTC offset in the form ±HHMM[SS[.ffffff]] |


**Formatting**

Weekday Month Day Hour:Minute:Second Year

In [None]:
format = "%a %b %d %H:%M:%S %Y"

today = datetime.datetime.today()
print ('ISO     :', today)

s = today.strftime(format)
print ('strftime:', s)

Obtain the time as HH:MM:SS (`%X` gives the locale's time representation):

In [None]:
print(today.strftime("%X"))

Obtain the hour on a 12-hour clock:

In [None]:
print(today.strftime("%I"))

Obtain AM or PM

In [None]:
print(today.strftime("%p"))

`%c`: locale's date and time, `%x`: locale's date, `%X`: locale's time

In [None]:
print("Date and Time =", today.strftime("%c"))
print("Date =         ", today.strftime("%x"))
print("Time =         ", today.strftime("%X"))

`%I`/`%H`: hour on a 12/24-hour clock, `%M`: minute, `%S`: second, `%p`: locale's AM/PM

In [None]:
print("Time =         ", today.strftime("%I:%M:%S %p")) # 12-hour clock: Hour:Minute:Second AM/PM
print("Hour:Minutes = ", today.strftime("%H:%M")) # 24-Hour:Minute

**Parsing**

In [None]:
d = datetime.datetime.strptime(s, format)
print ('strptime:', d.strftime(format))
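`strptime()` raises `ValueError` when the text does not match the format string. The sketch below wraps the call in a small helper; the name `parse_timestamp` and its default format are assumptions made for this example only.

```python
import datetime

def parse_timestamp(text, fmt="%Y-%m-%d %H:%M:%S"):
    """Return a datetime, or None if text does not match fmt (hypothetical helper)."""
    try:
        return datetime.datetime.strptime(text, fmt)
    except ValueError:
        return None

print(parse_timestamp("2018-07-04 13:45:30"))           # 2018-07-04 13:45:30
print(parse_timestamp("04/07/2018"))                    # None: wrong layout
print(parse_timestamp("04/07/2018", fmt="%d/%m/%Y"))    # 2018-07-04 00:00:00
```

Fields that the format does not mention, such as the time of day in the last call, default to zero.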

## <font color="red">Breakout</font>

On July 16, 1969, at 9:32 a.m. EDT, the huge, 363-foot-tall Saturn V rocket launched on the Apollo 11 mission from Pad A, Launch Complex 39, Kennedy Space Center. Write a Python program that computes (**from now**) the number of:

* Years
* Months
* Days
* Hours
* Minutes
* Seconds

since the launch.

Hint: <A HREF="https://stackoverflow.com/questions/1345827/how-do-i-find-the-time-difference-between-two-datetime-objects-in-python"> Check this website</A>

### Sample Problem 1

Write a function (add_Years) that adds numYears to a date object.

#### Solution

In [None]:
def add_Years (myDate, numYears):
    return myDate.replace(year=myDate.year + numYears)
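Note that the solution above raises `ValueError` when `myDate` is February 29 and the target year is not a leap year. One possible variation, sketched below under the assumption that falling back to February 28 is acceptable, catches that case. The name `add_years_safe` is hypothetical.

```python
import datetime

def add_years_safe(my_date, num_years):
    """Add num_years to my_date, mapping Feb 29 to Feb 28 in non-leap years."""
    target_year = my_date.year + num_years
    try:
        return my_date.replace(year=target_year)
    except ValueError:
        # Reached for Feb 29 in a non-leap target year; if the year itself
        # is out of range, this second call raises ValueError again.
        return my_date.replace(year=target_year, day=28)

print(add_years_safe(datetime.date(2008, 2, 29), 1))   # 2009-02-28
print(add_years_safe(datetime.date(2008, 2, 29), 4))   # 2012-02-29
```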

### Sample Problem 2

Write a function (add_Months) that adds numMonths to a date object.

#### Solution

In [None]:
import datetime

def add_Months(myDate, numMonths):
    m = myDate.month + numMonths
    year  = m // 12
    month = m % 12
    if month == 0:
        month = 12
        year -= 1
    try:
        newDate = datetime.date(myDate.year+year, month, myDate.day)
    except ValueError:
        import calendar
        # determine the total number of days in a new month
        m1, d1 = calendar.monthrange(myDate.year+year, month)
        newDate = datetime.date(myDate.year+year, month, d1)
    return newDate

### Sample Problem 3

Write a function (add_Days) that adds numDays to a date object.

#### Solution

In [None]:
def add_Days (myDate, numDays):
    return myDate + datetime.timedelta(days=numDays)

### Sample Problem 4

Write a function (increment_Date):
<UL> 
<LI> That has as arguments refDate (in the format YYYYMMDD), numYears, numMonths and numDays, and 
<LI> That adds numYears, numMonths and numDays to refDate.
<LI> That returns a new date in the format YYYYMMDD.
</UL>
<P>
Note that Years, Months and Days can be negative numbers.

#### Solution

In [None]:
import datetime

def increment_Date(Date, numYears=0, numMonths=0, numDays=0):
    # Extract the year, the month and day from Date
    y = Date // 10000
    m = (Date % 10000) // 100
    d = Date % 100
    
    # Determine the current date object
    curDate = datetime.date(y, m, d)
    
    # Increment the date
    curDate = add_Years (curDate, numYears)
    curDate = add_Months(curDate, numMonths)
    curDate = add_Days  (curDate, numDays)
    
    # Extract the new year, month and day
    curY = curDate.year
    curM = curDate.month
    curD = curDate.day
    
    # Compute the new date in the format YYYYMMDD
    newDate = 10000*curY + 100*curM + curD
    
    return newDate

print(increment_Date(20001231, numMonths=-15, numDays=7))
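As an alternative to the integer arithmetic, the YYYYMMDD conversion can be done with `strptime()` and `strftime()`. This is only a sketch: the helper names `int_to_date` and `date_to_int` are hypothetical, and four-digit years are assumed.

```python
import datetime

def int_to_date(yyyymmdd):
    """Convert an integer such as 20001231 to a date (assumes a 4-digit year)."""
    return datetime.datetime.strptime(str(yyyymmdd), "%Y%m%d").date()

def date_to_int(d):
    """Convert a date to an integer in the form YYYYMMDD (assumes a 4-digit year)."""
    return int(d.strftime("%Y%m%d"))

d = int_to_date(20001231)
print(d)                                              # 2000-12-31
print(date_to_int(d + datetime.timedelta(days=1)))    # 20010101
```

An invalid input such as 20001331 raises `ValueError` in `int_to_date`, just as `datetime.date(2000, 13, 31)` would.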

## <font color='red'>Pandas Datetime</font>
- Pandas provides a number of tools to handle time series data.
- Pandas datetime methods are used to work with datetime in Pandas.

Generate sequences of fixed-frequency dates and time spans:

In [None]:
dti = pd.date_range('2018-01-01', periods=15, freq='H')
print(type(dti))
dti

Use the sequence to create a Pandas series:

In [None]:
ts = pd.Series(range(len(dti)), index=dti)
print(ts)

Resample or convert the time series to a particular frequency:
- sample every two hours and compute the mean

In [None]:
ts.resample('2H').mean()

Create a Pandas series where the index is the time component:

In [None]:
num_periods = 67
ts = pd.Series(np.random.random(num_periods),
               index=pd.date_range('2000-01', periods=num_periods, freq='W'))
ts

Create a Pandas DataFrame where the index is the time component:

In [None]:
num_periods = 2500
df = pd.DataFrame(dict(X = np.random.random(num_periods), Y = -5+np.random.random(num_periods)),
                  index=pd.date_range('2000', periods=num_periods, freq='D'))
df

**Resampling**
- The `resample()` function is used to resample time-series data.
- It groups data by a certain time span. 
- You specify a method of how you would like to resample.
- Pandas comes with many built-in options for resampling, and you can even define your own methods.

Here are some time period options:

| Alias | Description |
| --- | --- |
| 'D' | Calendar day |
| 'W' | Weekly |
| 'M' | Month end |
| 'Q' | Quarter end |
| 'A' | Year end |

Here are some method options for resampling:

| Method | Description |
| --- | --- |
| max | Maximum value |
| mean | Mean of values in time range |
| median | Median of values in time range |
| min | Minimum data value |
| sum | Sum of values |


In [None]:
df.X.resample('Y').mean()

In [None]:
df.Y.resample('W').sum()

In [None]:
df.X.resample('Q').median()
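Random data makes the effect of resampling hard to check by eye, so here is a small sketch with known values. The series and the two-day bin width are made up for illustration. It applies two built-in methods and one custom function.

```python
import pandas as pd

# Six daily values: 1, 2, ..., 6 (illustrative data)
idx = pd.date_range('2018-01-01', periods=6, freq='D')
s = pd.Series([1, 2, 3, 4, 5, 6], index=idx)

two_day_sum = s.resample('2D').sum()     # 3, 7, 11
two_day_mean = s.resample('2D').mean()   # 1.5, 3.5, 5.5

# A custom method: the spread (max - min) within each two-day bin
two_day_range = s.resample('2D').apply(lambda x: x.max() - x.min())

print(two_day_sum)
print(two_day_mean)
print(two_day_range)
```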

### Example: Report on UFO Sighting

In [None]:
url = 'http://bit.ly/uforeports'
df_ufo = pd.read_csv(url)            
df_ufo 

Convert the `Time` column to datetime format:

In [None]:
df_ufo['Time'] = pd.to_datetime(df_ufo.Time)
df_ufo

Rename the column to `Date`:

In [None]:
df_ufo.rename(columns={'Time':'Date'}, inplace=True)
df_ufo

Set the `Date` column as the DataFrame index:

In [None]:
df_ufo = df_ufo.set_index(['Date'])
df_ufo

**Question**: How do we determine the number of sightings between two dates?

In [None]:
df1 = df_ufo.loc['1978-01-01 09:00:00':'1980-01-01 11:00:00']
df1
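The slice returns the matching rows; their count is the length of the result. The sketch below shows this on a tiny made-up DataFrame (the rows are invented for illustration and are not taken from the UFO dataset).

```python
import pandas as pd

# Invented sightings with a sorted datetime index, for illustration only
df_demo = pd.DataFrame(
    {'State': ['CA', 'NY', 'CA', 'TX']},
    index=pd.to_datetime(['1977-06-01 22:00', '1978-03-15 21:30',
                          '1979-11-02 03:00', '1980-05-20 20:00']))
df_demo.index.name = 'Date'

# Label-based slicing on a datetime index includes both end points
subset = df_demo.loc['1978-01-01 09:00:00':'1980-01-01 11:00:00']
num_sightings = len(subset)
print(num_sightings)   # 2
```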

**Question**: How do we extract the sightings in a specific month?

In [None]:
df2 = df_ufo[df_ufo.index.month == 2]
df2

**Question**: How to extract the sightings in a given State?

In [None]:
df3 = df_ufo[df_ufo['State']== 'CA']
df3

**Question**: How to count the number of sightings in each state?

In [None]:
df_ufo.groupby(['State']).count()

### Example: Temperatures (F) and Precipitation (inches) in July 2018 for Boulder, CO

In [None]:
url = "https://ndownloader.figshare.com/files/12948515"
df = pd.read_csv(url)            
df 

In [None]:
df.info()

In [None]:
df.describe()

In [None]:
df['date'] = pd.to_datetime(df.date)
df

In [None]:
df = df.set_index(['date'])
df

In [None]:
import numpy as np
df[df == -999.000] = np.nan
df

In [None]:
df = df.dropna()
df
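The two cells above treat `-999` as a missing-value marker. The same cleanup can be sketched on a small made-up DataFrame using `replace()`, which is an alternative to the boolean-mask assignment; the values below are invented for illustration.

```python
import numpy as np
import pandas as pd

# Invented values; -999.0 marks a missing measurement
df_demo = pd.DataFrame({'max_temp': [88.0, -999.0, 91.0],
                        'precip': [0.0, 0.1, -999.0]})

cleaned = df_demo.replace(-999.0, np.nan)   # sentinel -> NaN
complete = cleaned.dropna()                 # keep rows without NaN

print(cleaned.isna().sum())   # missing values per column
print(complete)
```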

In [None]:
ax = df.plot(kind='scatter', x='max_temp', y='precip');
ax.set_ylabel("Precipitation (inches)");
ax.set_xlabel("Temperature (F)");

In [None]:
ax = df.precip.plot(title="Daily Total Precipitation - Boulder Colorado in July 2018",
                    figsize=(12,6));
ax.set_ylabel("Precipitation (inches)");

### Example: Arctic Oscillation and North Atlantic Oscillation  Datasets
- The <a href="https://en.wikipedia.org/wiki/Arctic_oscillation">Arctic oscillation (AO)</a> or Northern Annular Mode/Northern Hemisphere Annular Mode (NAM) is a weather phenomenon at the Arctic pole north of 20 degrees latitude. It is an important mode of climate variability for the Northern Hemisphere.
- The <a href="https://en.wikipedia.org/wiki/North_Atlantic_oscillation">North Atlantic Oscillation (NAO)</a> is a weather phenomenon in the North Atlantic Ocean of fluctuations in the difference of atmospheric pressure at sea level (SLP) between the Icelandic Low and the Azores High. 

In [None]:
ao_url = "http://www.cpc.ncep.noaa.gov/products/precip/CWlink/daily_ao_index/monthly.ao.index.b50.current.ascii"
nao_url = "http://www.cpc.ncep.noaa.gov/products/precip/CWlink/pna/norm.nao.monthly.b5001.current.ascii"

In [None]:
nao_df = pd.read_table(nao_url, sep=r'\s+', 
               parse_dates={'dates':[0, 1]}, header=None, index_col=0, squeeze=True)
nao_df

In [None]:
ao_df = pd.read_table(ao_url, sep=r'\s+', 
               parse_dates={'dates':[0, 1]}, header=None, index_col=0, squeeze=True)
ao_df

In [None]:
ao_df.columns = ['dates', 'AO']
ao_df

Create a Pandas DataFrame by combining the two Pandas Series. Note that the frequency of the data is one month (freq='M').

In [None]:
aonao_df = pd.DataFrame({'AO':ao_df.to_period(freq='M'), 'NAO':nao_df.to_period(freq='M')})
aonao_df

In [None]:
aonao_df.NAO.plot();

In [None]:
aonao_df.NAO['2010':'2019'].plot();

In [None]:
aonao_df.NAO['2010-02':'2010-11'].plot();

In [None]:
aonao_df.plot(subplots=True);

In [None]:
aonao_df.loc[(aonao_df.AO > 0) & (aonao_df.NAO < 0) 
        & (aonao_df.index > '2010-01') 
        & (aonao_df.index < '2020-01'), 'NAO'].plot(kind='barh');

#### Resampling

- Pandas provides an easy way to resample data to a different time frequency.
- Two main parameters for resampling:
     1. The time period you resample to.
     2. The aggregation method applied to each period (for example, `mean`).
     
In the example below we calculate the annual mean ("A").

In [None]:
aonao_df_mm = aonao_df.resample("A").mean()
aonao_df_mm.plot(style='g--', subplots=True);

In [None]:
aonao_df_mm = aonao_df.resample("A").median()
aonao_df_mm.plot(style='g--', subplots=True);

You can also use your own functions for resampling, for example `np.max` (in this case we change the resampling frequency to 3 years):

In [None]:
aonao_df_mm = aonao_df.resample("3A").apply(np.max)
aonao_df_mm.plot(style='g--', subplots=True);

You can specify several functions at once as a list:

In [None]:
aonao_df_mm = aonao_df.NAO.resample("A").apply(['mean', np.min, np.max])
aonao_df_mm['1900':'2020'].plot(subplots=True);

#### Group By

Group by is a process that involves one or more of the following steps:

- Splitting the data into groups based on some criteria.
- Applying a function to each group independently.
- Combining the results into a data structure.

Group by year:

In [None]:
aonao_df_gb_year = aonao_df.groupby(by=[aonao_df.index.year]).mean()
aonao_df_gb_year

In [None]:
aonao_df.groupby(pd.Grouper(freq='A')).mean()

Group by month:

In [None]:
aonao_df_gb_month = aonao_df.groupby(by=[aonao_df.index.month]).mean()
aonao_df_gb_month

In [None]:
aonao_df.groupby(pd.Grouper(freq='M')).mean()

Quarterly data:

In [None]:
aonao_df.groupby(pd.Grouper(freq='Q')).mean()
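Grouping by `index.month` and grouping by calendar period answer different questions: the first pools the same month across all years, while the second keeps every month of every year separate. A minimal sketch on made-up values, using year and month keys in place of `pd.Grouper`:

```python
import pandas as pd

# Two Januaries and two Februaries (illustrative values)
idx = pd.to_datetime(['2000-01-15', '2000-02-15', '2001-01-15', '2001-02-15'])
s = pd.Series([1.0, 2.0, 3.0, 4.0], index=idx)

# Same month pooled across years: one value per month number
by_month = s.groupby(s.index.month).mean()

# Each (year, month) pair kept separate: one value per calendar month
by_year_month = s.groupby([s.index.year, s.index.month]).mean()

print(by_month)
print(by_year_month)
```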