<img src="http://imgur.com/1ZcRyrc.png" style="float: left; margin: 20px; height: 55px">

# Timeseries and Datetime

---

## Learning Objectives

### Core
- Use the datetime library to represent dates as objects
- Know how to calculate time differences with timedelta
- Use datetime objects as an index for pandas DataFrames/Series


<h1>Lesson Guide<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Timeseries-and-Datetime" data-toc-modified-id="Timeseries-and-Datetime-1">Timeseries and Datetime</a></span><ul class="toc-item"><li><span><a href="#Learning-Objectives" data-toc-modified-id="Learning-Objectives-1.1">Learning Objectives</a></span><ul class="toc-item"><li><span><a href="#Core" data-toc-modified-id="Core-1.1.1">Core</a></span></li></ul></li><li><span><a href="#The-datetime-library" data-toc-modified-id="The-datetime-library-1.2">The <code>datetime</code> library</a></span></li><li><span><a href="#datetime-object" data-toc-modified-id="datetime-object-1.3"><code>datetime</code> object</a></span><ul class="toc-item"><li><span><a href="#Let's-set-a-random-datetime." data-toc-modified-id="Let's-set-a-random-datetime.-1.3.1">Let's set a random datetime.</a></span></li><li><span><a href="#The-components-of-the-date-are-accessible-via-attributes-of-the-object." data-toc-modified-id="The-components-of-the-date-are-accessible-via-attributes-of-the-object.-1.3.2">The components of the date are accessible via attributes of the object.</a></span></li></ul></li><li><span><a href="#timedelta" data-toc-modified-id="timedelta-1.4"><code>timedelta</code></a></span></li><li><span><a href="#Load-the-UFO-reports-data" data-toc-modified-id="Load-the-UFO-reports-data-1.5">Load the UFO reports data</a></span></li><li><span><a href="#Pandas'-pd.datetime" data-toc-modified-id="Pandas'-pd.datetime-1.6">Pandas' <code>pd.datetime</code></a></span><ul class="toc-item"><li><span><a href="#The-.dt-attribute" data-toc-modified-id="The-.dt-attribute-1.6.1">The <code>.dt</code> attribute</a></span></li></ul></li><li><span><a href="#Time-stamps" data-toc-modified-id="Time-stamps-1.7">Time stamps</a></span></li><li><span><a href="#Additional-resources" data-toc-modified-id="Additional-resources-1.8">Additional resources</a></span></li></ul></li></ul></div>

<a id="the-datetime-library"></a>
## The `datetime` library
---

The Python library `datetime` is well suited to working with time-related data. Pandas builds on this `datetime` library for its own datetime series and objects.

We're going to review these data types and learn a little more about them.
- Datetime Object
- Datetime Series
- Time Stamp
- Time Delta


<a id="datetime-object"></a>
## `datetime` object
---

Below we import the `datetime` class from the `datetime` library. With it we can create a datetime object by passing the different components of the date as arguments.

In [1]:
from datetime import datetime

### Let's set a random datetime.

In [2]:
lesson_date = datetime(2012, 12, 21, 12, 21, 12, 844089)

### The components of the date are accessible via attributes of the object.

In [3]:
print("Micro-Second", lesson_date.microsecond)
print("Second", lesson_date.second)
print("Minute", lesson_date.minute)
print("Hour", lesson_date.hour)
print("Day", lesson_date.day)
print("Month",lesson_date.month)
print("Year", lesson_date.year)

Micro-Second 844089
Second 12
Minute 21
Hour 12
Day 21
Month 12
Year 2012
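
Besides reading individual attributes, a datetime object can be turned into text and back. The following is a minimal sketch of that round trip using the same `lesson_date`; the format string is an arbitrary choice made for this example.

```python
from datetime import datetime

lesson_date = datetime(2012, 12, 21, 12, 21, 12, 844089)

# Format the datetime as text; the format string is an assumption for this example.
as_text = lesson_date.strftime("%Y-%m-%d %H:%M:%S")
print(as_text)

# Parse the text back into a datetime. The microseconds were not part of
# the format string, so they are not recovered.
parsed = datetime.strptime(as_text, "%Y-%m-%d %H:%M:%S")
print(parsed)

# weekday() counts from Monday = 0 to Sunday = 6.
print(lesson_date.weekday())
```

The format codes are described in the strftime/strptime section of the `datetime` documentation.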


## `timedelta`


Say we want to add time to a date or subtract time from it. Maybe we are using time as an index and we want to get everything that happened in the week before a specific observation.

We can use a timedelta object to shift a datetime object, which is essentially arithmetic with dates. Here's an example:


In [4]:
# Import timedelta from datetime library
from datetime import timedelta

Time deltas represent durations rather than dates.

In [5]:
offset = timedelta(days=1, seconds=20)

In [6]:
offset

datetime.timedelta(1, 20)

The time delta has attributes that allow us to extract values from it.

In [7]:
print('offset days', offset.days)
print('offset seconds', offset.seconds)
print('offset microseconds', offset.microseconds)

offset days 1
offset seconds 20
offset microseconds 0


The `.now()` method of `datetime` returns a datetime object for the current moment.

In [8]:
now = datetime.now()
print("Like Right Now: ", now)

Like Right Now:  2018-08-19 13:00:49.723092


The current time is particularly useful when using timedeltas.

In [9]:
print("Future: ", now + offset)
print("Past: ", now - offset)

Future:  2018-08-20 13:01:09.723092
Past:  2018-08-18 13:00:29.723092
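
The same arithmetic works between any two datetime objects, not only with `now`. As a rough sketch, the example below selects everything in the week before a reference observation and measures a time difference. The observation times are made up for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical observation times, invented for this example.
observations = [
    datetime(2018, 8, 10, 9, 0),
    datetime(2018, 8, 14, 18, 30),
    datetime(2018, 8, 18, 7, 45),
    datetime(2018, 8, 19, 13, 0),
]

reference = observations[-1]
one_week = timedelta(weeks=1)

# Keep everything that happened in the week leading up to the reference.
window_start = reference - one_week
previous_week = [t for t in observations if window_start <= t < reference]
print(previous_week)

# Subtracting one datetime from another gives a timedelta.
gap = reference - observations[0]
print(gap, gap.total_seconds())
```

Using fixed datetimes instead of `datetime.now()` keeps the result reproducible.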


> Note: A timedelta object stores only days, seconds and microseconds. The constructor also accepts weeks, hours, minutes and milliseconds, but it has no month or year arguments, i.e. you can't directly ask for an offset of 2 years, 44 days and 12 hours. You would have to convert those years to days yourself.

You can read more about that in the [timedelta documentation](https://docs.python.org/2/library/datetime.html).
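
To make the point about units concrete, here is a small sketch. It assumes 365-day years, which ignores leap years, so treat the result as an approximation.

```python
from datetime import timedelta

# Weeks and hours are accepted by the constructor, then normalised
# into days, seconds and microseconds.
mixed = timedelta(weeks=2, hours=36)
print(mixed.days, mixed.seconds)

# "2 years, 44 days and 12 hours", assuming 365-day years (an approximation).
DAYS_PER_YEAR = 365
long_offset = timedelta(days=2 * DAYS_PER_YEAR + 44, hours=12)
print(long_offset)
```

If calendar-exact offsets matter, a fixed number of days per year is not enough, because the correct day count depends on the start date.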

<a id="load-the-ufo-reports-data"></a>
## Load the UFO reports data
---

We can practice using datetime functions and objects with the UFO reports data.

In [10]:
import pandas as pd

In [11]:
ufo = pd.read_csv('../../../../resource-datasets/ufo_sightings/ufo.csv')

In [12]:
ufo.shape

(80543, 5)

In [13]:
ufo.head()

Unnamed: 0,City,Colors Reported,Shape Reported,State,Time
0,Ithaca,,TRIANGLE,NY,6/1/1930 22:00
1,Willingboro,,OTHER,NJ,6/30/1930 20:00
2,Holyoke,,OVAL,CO,2/15/1931 14:00
3,Abilene,,DISK,KS,6/1/1931 13:00
4,New York Worlds Fair,,LIGHT,NY,4/18/1933 19:00


We can see that the Time column is of object type.

In [14]:
ufo.dtypes

City               object
Colors Reported    object
Shape Reported     object
State              object
Time               object
dtype: object

## Pandas' `pd.datetime`


When using pandas we can convert columns of data from string objects into date objects with the `pd.to_datetime` function.

> **Note**: Dates can be tricky to parse as they come in many formats. The `to_datetime` function has a keyword argument `infer_datetime_format` that can help when parsing dates. (Since pandas 2.0 this argument is deprecated, because format inference is the default behaviour.)

Overwrite the original Time column with one that has been converted to a datetime series:

In [15]:
ufo['Time'] = pd.to_datetime(ufo.Time)

In [16]:
ufo.Time.head()

0   1930-06-01 22:00:00
1   1930-06-30 20:00:00
2   1931-02-15 14:00:00
3   1931-06-01 13:00:00
4   1933-04-18 19:00:00
Name: Time, dtype: datetime64[ns]

Letting pandas guess the format can take a little bit of time. We can pass a few arguments to help.

In [17]:
ufo['Time_guided'] = pd.to_datetime(ufo.Time, format='%Y%m%d', errors='coerce')

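With `format` we tell pandas which format to use when interpreting the dates, and `errors` controls what happens when a value cannot be converted (`'coerce'` turns it into `NaT`). Note that `ufo.Time` was already converted above, so `format` has no effect in this cell. For the raw strings in this file, such as `6/1/1930 22:00`, the matching format would be `'%m/%d/%Y %H:%M'`.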
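
As an illustration of both arguments on unparsed strings, here is a minimal sketch. The first two values are copied from the Time column shown earlier; the third is a deliberately invalid, made-up entry.

```python
import pandas as pd

# Two raw values from the UFO data plus one invalid, made-up entry.
raw = pd.Series(["6/1/1930 22:00", "6/30/1930 20:00", "not a date"])

# format describes month/day/year hour:minute;
# errors="coerce" turns values that do not match into NaT.
parsed = pd.to_datetime(raw, format="%m/%d/%Y %H:%M", errors="coerce")
print(parsed)
```

With the default `errors="raise"`, the invalid entry would raise an exception instead of becoming `NaT`.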

In [18]:
ufo.describe()

Unnamed: 0,City,Colors Reported,Shape Reported,State,Time,Time_guided
count,80496,17034,72141,80543,80543,80543
unique,13504,31,27,52,68901,68901
top,Seattle,ORANGE,LIGHT,CA,2014-07-04 22:00:00,2014-07-04 22:00:00
freq,646,5216,16332,10743,45,45
first,,,,,1930-06-01 22:00:00,1930-06-01 22:00:00
last,,,,,2014-09-05 05:30:00,2014-09-05 05:30:00


In [19]:
ufo.Time_guided.sort_values().unique()

array(['1930-06-01T22:00:00.000000000', '1930-06-30T20:00:00.000000000',
       '1931-02-15T14:00:00.000000000', ...,
       '2014-09-05T02:40:00.000000000', '2014-09-05T03:43:00.000000000',
       '2014-09-05T05:30:00.000000000'], dtype='datetime64[ns]')

The structure of the time columns has changed slightly.

In [20]:
ufo.head()

Unnamed: 0,City,Colors Reported,Shape Reported,State,Time,Time_guided
0,Ithaca,,TRIANGLE,NY,1930-06-01 22:00:00,1930-06-01 22:00:00
1,Willingboro,,OTHER,NJ,1930-06-30 20:00:00,1930-06-30 20:00:00
2,Holyoke,,OVAL,CO,1931-02-15 14:00:00,1931-02-15 14:00:00
3,Abilene,,DISK,KS,1931-06-01 13:00:00,1931-06-01 13:00:00
4,New York Worlds Fair,,LIGHT,NY,1933-04-18 19:00:00,1933-04-18 19:00:00


We can see that the dtype of the Time column has changed.

In [21]:
ufo.dtypes

City                       object
Colors Reported            object
Shape Reported             object
State                      object
Time               datetime64[ns]
Time_guided        datetime64[ns]
dtype: object

<a id="the-dt-attribute"></a>
### The `.dt` attribute

Pandas datetime columns have a `.dt` attribute that allows you to access attributes specific to the dates. For example:
```python
ufo.Time.dt.day
ufo.Time.dt.month
ufo.Time.dt.year
ufo.Time.dt.day_name()  # replaces .dt.weekday_name, which was removed in pandas 1.0
```

And many more.

In [22]:
# .dt.weekday_name was removed in pandas 1.0; .dt.day_name() returns the same values
ufo.Time.dt.day_name().head()

0     Sunday
1     Monday
2     Sunday
3     Monday
4    Tuesday
Name: Time, dtype: object

In [23]:
ufo.Time.dt.dayofyear.head()

0    152
1    181
2     46
3    152
4    108
Name: Time, dtype: int64
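
Several `.dt` attributes can be collected side by side. The sketch below uses three timestamps taken from the first rows of the UFO data, so it runs without the CSV file; the column names of `parts` are chosen for this example only.

```python
import pandas as pd

# Three timestamps copied from the first rows of the UFO data.
times = pd.Series(
    pd.to_datetime(["1930-06-01 22:00", "1930-06-30 20:00", "1931-02-15 14:00"])
)

# Collect a few date components into one table.
parts = pd.DataFrame({
    "year": times.dt.year,
    "month": times.dt.month,
    "day_name": times.dt.day_name(),
    "dayofyear": times.dt.dayofyear,
    "hour": times.dt.hour,
})
print(parts)
```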

## Time stamps

Timestamps are useful objects for comparisons. You can create a timestamp object with the `pd.to_datetime` function and a string specifying the date. These timestamps are useful when you need to do logical filtering with dates.

In [24]:
# Time Stamp
ts = pd.to_datetime('1/1/1999')
ts

Timestamp('1999-01-01 00:00:00')

A pandas Timestamp is a subclass of Python's `datetime` and behaves much like one. Its main advantage here is that it can be compared against a whole datetime column at once.

In [25]:
ufo.shape

(80543, 6)

Use that Time Stamp for a comparison:

In [26]:
ufo.loc[ufo.Time >= ts, :].shape

(67711, 6)
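
The same kind of filtering also works when the dates are used as the index. The following is a rough sketch on a hypothetical miniature of the UFO table: the first three rows are copied from the data above, and the last row is made up so that something falls after 1999.

```python
import pandas as pd

# Hypothetical miniature of the UFO table ("Example City" is invented).
sightings = pd.DataFrame({
    "City": ["Ithaca", "Willingboro", "Holyoke", "Example City"],
    "Time": pd.to_datetime([
        "1930-06-01 22:00", "1930-06-30 20:00",
        "1931-02-15 14:00", "2001-03-04 21:15",
    ]),
})

# Boolean filtering with a Timestamp, as in the cell above.
ts = pd.Timestamp("1999-01-01")
recent = sightings.loc[sightings["Time"] >= ts]

# With the dates as a sorted index, rows can be selected by label.
indexed = sightings.set_index("Time").sort_index()
in_1930 = indexed.loc["1930"]   # every row from the year 1930
since_1999 = indexed.loc[ts:]   # every row from the timestamp onwards

print(recent)
print(in_1930)
print(since_1999)
```

Sorting the index first keeps the slice `indexed.loc[ts:]` well defined.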

We can even compute the time span between the first and last dates of a timeseries:

In [27]:
ufo.Time.max() - ufo.Time.min()

Timedelta('30776 days 07:30:00')

The result is expressed in days and smaller units. Months and years are not consistent in length, so they would be ambiguous, and weeks are just seven days.
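
If an approximate figure in years is still wanted, one option is to divide the span by a Timedelta of average year length. This is only a sketch: the 365.25-day year is an assumption, and the two timestamps are the minimum and maximum of the Time column reported above.

```python
import pandas as pd

# First and last sighting times, as reported for the UFO data.
first = pd.Timestamp("1930-06-01 22:00:00")
last = pd.Timestamp("2014-09-05 05:30:00")

span = last - first
print(span)

# Dividing one Timedelta by another gives a plain float.
# 365.25 days per year is an approximation.
approx_years = span / pd.Timedelta(days=365.25)
print(round(approx_years, 2))
```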

## Additional resources

- See more information about pandas Datetime [here](http://pandas.pydata.org/pandas-docs/stable/timeseries.html).
- For the datetime module, see [here](https://docs.python.org/3/library/datetime.html#strftime-and-strptime-behavior).