# Cross-Sectional Data

<img src="./assets/cross_sectional_data.png" width=400></img>

Data collected at a single point in time for each individual is called **cross-sectional data**.

# Time-Series/Longitudinal Data

<img src="./assets/time_series_data.png" width=400></img>

Data on the same variables collected at multiple time points from the same individuals is called **longitudinal** or **time-series** data.
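A minimal sketch of the difference, using two small made-up tables (the `person`, `year`, and `height_cm` columns and all values are hypothetical): the cross-sectional table has one row per individual, while the longitudinal table has one row per individual per time point.

```python
import pandas as pd

# Hypothetical cross-sectional data: one measurement per individual
cross_sectional = pd.DataFrame({
    "person": ["ana", "ben", "cy"],
    "height_cm": [160, 175, 182],
})

# Hypothetical longitudinal data: the same individuals measured in two years
longitudinal = pd.DataFrame({
    "person": ["ana", "ana", "ben", "ben", "cy", "cy"],
    "year": [2020, 2021, 2020, 2021, 2020, 2021],
    "height_cm": [160, 161, 175, 175, 182, 183],
})

# Count how many rows each individual contributes to each table
rows_per_person_cs = cross_sectional.groupby("person").size()
rows_per_person_long = longitudinal.groupby("person").size()
print(rows_per_person_cs)
print(rows_per_person_long)
```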

# Time Series Decomposition -- Basic Concepts

Google searches for "data science" show a strong increasing trend over the last five years:

![](./assets/data_science.png)

Google searches for "gingerbread house" show strong annual seasonality:

![](./assets/gingerbread_house.png)

Google searches for "iphone" show both trend and seasonality:

![](./assets/iphone.png)
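To make "trend" and "seasonality" concrete, here is a rough sketch of an additive decomposition using only pandas and NumPy. The series is synthetic (a made-up linear trend plus a 12-month sine wave), not Google Trends data, and the rolling-mean approach is one simple way to separate the two components, not the only one.

```python
import numpy as np
import pandas as pd

# Synthetic monthly series (made up for illustration, not Google Trends data)
months = pd.date_range("2015-01-01", periods=60, freq="MS")
t = np.arange(60)
series = pd.Series(0.5 * t + 10 * np.sin(2 * np.pi * t / 12), index=months)

# Trend estimate: a 12-month rolling mean averages out one full seasonal cycle
trend_est = series.rolling(window=12, center=True).mean()

# Seasonal estimate: average the detrended values for each calendar month
detrended = series - trend_est
seasonal_est = detrended.groupby(detrended.index.month).mean()

print(trend_est.dropna().head())
print(seasonal_est)
```

The rolling mean is undefined near both ends of the series, so `trend_est` starts and ends with `NaN` values.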

# Pandas Timestamps

In [1]:
import pandas as pd

In [2]:
ufo = pd.read_csv('http://bit.ly/uforeports')
ufo.head()

| | City | Colors Reported | Shape Reported | State | Time |
|---|---|---|---|---|---|
| 0 | Ithaca | NaN | TRIANGLE | NY | 6/1/1930 22:00 |
| 1 | Willingboro | NaN | OTHER | NJ | 6/30/1930 20:00 |
| 2 | Holyoke | NaN | OVAL | CO | 2/15/1931 14:00 |
| 3 | Abilene | NaN | DISK | KS | 6/1/1931 13:00 |
| 4 | New York Worlds Fair | NaN | LIGHT | NY | 4/18/1933 19:00 |


In [None]:
# Check the Python type of each value in the Time column (plain strings before conversion)
ufo.loc[:, 'Time'].apply(type)

In [6]:
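# Convert times from str to pandas Timestamp objects.
# Assigning to ufo['Time'] replaces the column, so it takes on a datetime dtype;
# in recent pandas versions, assigning through .loc[:, 'Time'] can keep the old object dtype.
ufo['Time'] = pd.to_datetime(ufo['Time'])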

In [7]:
# Confirm that every value is now a Timestamp
ufo.loc[:, 'Time'].apply(type)

0        <class 'pandas._libs.tslib.Timestamp'>
1        <class 'pandas._libs.tslib.Timestamp'>
2        <class 'pandas._libs.tslib.Timestamp'>
3        <class 'pandas._libs.tslib.Timestamp'>
4        <class 'pandas._libs.tslib.Timestamp'>
5        <class 'pandas._libs.tslib.Timestamp'>
6        <class 'pandas._libs.tslib.Timestamp'>
7        <class 'pandas._libs.tslib.Timestamp'>
8        <class 'pandas._libs.tslib.Timestamp'>
9        <class 'pandas._libs.tslib.Timestamp'>
10       <class 'pandas._libs.tslib.Timestamp'>
11       <class 'pandas._libs.tslib.Timestamp'>
12       <class 'pandas._libs.tslib.Timestamp'>
13       <class 'pandas._libs.tslib.Timestamp'>
14       <class 'pandas._libs.tslib.Timestamp'>
15       <class 'pandas._libs.tslib.Timestamp'>
16       <class 'pandas._libs.tslib.Timestamp'>
17       <class 'pandas._libs.tslib.Timestamp'>
18       <class 'pandas._libs.tslib.Timestamp'>
19       <class 'pandas._libs.tslib.Timestamp'>
20       <class 'pandas._libs.tslib.Timestamp'>
         ..
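A quicker check than listing the type of every row is to look at the column's dtype. The sketch below uses the first three `Time` strings from the table above in a standalone Series; the explicit `format` string is an assumption inferred from those sample rows, and passing it is optional.

```python
import pandas as pd

# First three Time values from the UFO data, still plain strings
times = pd.Series(["6/1/1930 22:00", "6/30/1930 20:00", "2/15/1931 14:00"])

# An explicit format skips inference; "%m/%d/%Y %H:%M" is assumed from the sample rows
parsed = pd.to_datetime(times, format="%m/%d/%Y %H:%M")

# The dtype summarizes the whole column in one line
print(times.dtype)
print(parsed.dtype)
print(pd.api.types.is_datetime64_any_dtype(parsed))
```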

In [8]:
# Extract the month number (1-12) from each Timestamp
ufo.loc[:, 'Time'].apply(lambda timestamp: timestamp.month)

0         6
1         6
2         2
3         6
4         4
5         9
6         6
7         7
8        10
9         6
10        8
11        6
12        6
13        7
14        6
15        7
16        2
17        6
18        7
19        4
20        6
21        8
22        8
23       10
24        1
25        1
26        1
27        4
28        6
29        6
         ..
18211    12
18212    12
18213    12
18214    12
18215    12
18216    12
18217    12
18218    12
18219    12
18220    12
18221    12
18222    12
18223    12
18224    12
18225    12
18226    12
18227    12
18228    12
18229    12
18230    12
18231    12
18232    12
18233    12
18234    12
18235    12
18236    12
18237    12
18238    12
18239    12
18240    12
Name: Time, Length: 18241, dtype: int64
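For a datetime column, the `.dt` accessor gives the same result without a lambda and exposes other components as well. A small sketch on a standalone Series built from the first three `Time` values (the variable names are just for illustration):

```python
import pandas as pd

# Standalone datetime Series built from the first three Time values
times = pd.Series(
    pd.to_datetime(
        ["6/1/1930 22:00", "6/30/1930 20:00", "2/15/1931 14:00"],
        format="%m/%d/%Y %H:%M",
    )
)

# .dt exposes datetime components for the whole column at once
months = times.dt.month
years = times.dt.year
hours = times.dt.hour
weekday_names = times.dt.day_name()

print(months.tolist())
print(years.tolist())
print(weekday_names.tolist())
```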

In [10]:
# Set times as index
ufo.set_index('Time', inplace=True)

In [11]:
# Get all data from 1936
ufo.loc['1936', :]

| Time | City | Colors Reported | Shape Reported | State |
|---|---|---|---|---|
| 1936-07-15 00:00:00 | Alma | NaN | DISK | MI |
| 1936-10-15 17:00:00 | Eklutna | NaN | CIGAR | AK |
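Selecting by a year string works because the index holds datetimes; pandas calls this partial string indexing. The sketch below rebuilds a tiny frame from the first five sightings (the name `sightings` is just for illustration) and shows that the same idea extends to a single month and to ranges.

```python
import pandas as pd

# Tiny stand-in for the UFO data: the first five sightings, indexed by time
sightings = pd.DataFrame(
    {
        "City": ["Ithaca", "Willingboro", "Holyoke", "Abilene", "New York Worlds Fair"],
        "State": ["NY", "NJ", "CO", "KS", "NY"],
    },
    index=pd.to_datetime(
        ["6/1/1930 22:00", "6/30/1930 20:00", "2/15/1931 14:00",
         "6/1/1931 13:00", "4/18/1933 19:00"],
        format="%m/%d/%Y %H:%M",
    ),
)
sightings.index.name = "Time"

year_1930 = sightings.loc["1930"]            # every row in 1930
june_1930 = sightings.loc["1930-06"]         # every row in June 1930
from_1931_to_1933 = sightings.loc["1931":"1933"]  # both endpoints included

print(year_1930)
print(from_1931_to_1933)
```

Slicing with strings like this relies on the index being sorted by time, as it is here.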


In [None]:
# Get all data from June (of any year)
ufo.loc[ufo.index.month == 6, :]
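Filtering on a component of the index uses a boolean mask, and the same component can serve as a grouping key. A sketch on a small stand-in frame built from the first five sightings (`sightings` is a name used only here):

```python
import pandas as pd

# Tiny stand-in for the UFO data: the first five sightings, indexed by time
sightings = pd.DataFrame(
    {"City": ["Ithaca", "Willingboro", "Holyoke", "Abilene", "New York Worlds Fair"]},
    index=pd.to_datetime(
        ["6/1/1930 22:00", "6/30/1930 20:00", "2/15/1931 14:00",
         "6/1/1931 13:00", "4/18/1933 19:00"],
        format="%m/%d/%Y %H:%M",
    ),
)

# Boolean mask: True for rows whose timestamp falls in June of any year
is_june = sightings.index.month == 6
june = sightings.loc[is_june, :]

# Count sightings per calendar month and per year
per_month = sightings.groupby(sightings.index.month).size()
per_year = sightings.groupby(sightings.index.year).size()

print(june)
print(per_month)
print(per_year)
```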