# Loading a CSV file in pandas

```
import pandas as pd
df = pd.read_csv('file_name.csv',
                 parse_dates=['date_col1', 'date_col2'])  # parse these columns as dates while loading
```

```
df['date_col'] = pd.to_datetime(df['date_col'],
                                format="%Y-%m-%d %H:%M:%S")  # use an explicit format if automatic parsing fails
```
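A minimal, self-contained sketch of both approaches, using an in-memory CSV instead of a file on disk. The column names (`start`, `end`) and the sample rows are made up for illustration.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for 'file_name.csv'
csv_text = """start,end
2017-10-01 15:23:25,2017-10-01 15:26:26
2017-10-01 15:42:57,2017-10-01 17:49:59
"""

# Parse 'start' while loading; 'end' is not parsed as a date yet
df = pd.read_csv(io.StringIO(csv_text), parse_dates=['start'])

# Convert 'end' afterwards with an explicit format
df['end'] = pd.to_datetime(df['end'], format="%Y-%m-%d %H:%M:%S")

print(df.dtypes)
```

Passing an explicit `format` is mostly useful when the strings are in an unusual layout that pandas cannot infer on its own.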

# Making timedelta columns

```
duration = df['End date'] - df['Start date']  # Series of timedeltas

df['Duration'] = duration.dt.total_seconds()  # total seconds in each timedelta
```
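As a small worked example, assuming two hypothetical rides with `Start date` and `End date` columns, the subtraction and conversion look like this:

```python
import pandas as pd

# Hypothetical ride data, invented for illustration
df = pd.DataFrame({
    'Start date': pd.to_datetime(['2017-10-01 15:23:25', '2017-10-01 15:42:57']),
    'End date': pd.to_datetime(['2017-10-01 15:26:26', '2017-10-01 17:49:59']),
})

duration = df['End date'] - df['Start date']   # timedelta Series
df['Duration'] = duration.dt.total_seconds()   # float seconds

print(df['Duration'].tolist())  # [181.0, 7622.0]
```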

# Time Stats

```
df['Duration'].median()  # median duration
```
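The same statistics also work directly on a timedelta column, not only on seconds. A short sketch with three made-up durations:

```python
import pandas as pd

# Hypothetical durations of 181 s, 7622 s and 600 s
durations = pd.Series(pd.to_timedelta([181, 7622, 600], unit='s'))

median_td = durations.median()                          # a Timedelta
median_seconds = durations.dt.total_seconds().median()  # a float

print(median_td, median_seconds)
```

Working on the timedelta column keeps the result readable as a duration, while the seconds version is easier to feed into plots or further arithmetic.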

# Resampling

```
import matplotlib.pyplot as plt

# 'D' resamples daily; 'M' resamples monthly (spelled 'ME' in pandas 2.2+)
# size() counts the rows in each period, which gives a Series that can be plotted
(df.resample('D', on='date_col')
   .size()
   .plot(ylim=[0, 15]))

plt.show()
```
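To see what `size()` returns before plotting, here is a sketch on a tiny hypothetical DataFrame (the `value` column and the dates are invented). Days with no rows still appear, with a count of zero.

```python
import pandas as pd

# Hypothetical data: two rows on Oct 1, one on Oct 2, none on Oct 3, one on Oct 4
df = pd.DataFrame({
    'date_col': pd.to_datetime([
        '2017-10-01 08:00', '2017-10-01 17:30',
        '2017-10-02 09:15',
        '2017-10-04 12:00',
    ]),
    'value': [1, 2, 3, 4],
})

# Count rows per calendar day
daily_counts = df.resample('D', on='date_col').size()

print(daily_counts.tolist())  # [2, 1, 0, 1]
```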

# Find length of a resample type

`size()` counts the number of rows in each resampled period.

```
df_resample.size()
```

# Combining groupby() and resample()

```
grouped = df.groupby('Col1')\
  .resample('M', on = 'Date_Col')
```
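The grouped resampler does nothing until an aggregation is applied. A minimal sketch, assuming a hypothetical `Duration` column and using `'MS'` (month start) as the frequency so the example behaves the same across pandas versions:

```python
import pandas as pd

# Hypothetical data, invented for illustration
df = pd.DataFrame({
    'Col1': ['member', 'casual', 'member', 'casual'],
    'Date_Col': pd.to_datetime(['2017-10-05', '2017-10-20',
                                '2017-11-03', '2017-11-15']),
    'Duration': [100.0, 300.0, 200.0, 500.0],
})

# Median duration per group and per month
monthly_median = (
    df.groupby('Col1')
      .resample('MS', on='Date_Col')['Duration']
      .median()
)

print(monthly_median)
```

The result is a Series with a two-level index: the group key first, then the start of each month.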

# Timezones in Pandas

- use `tz_localize()` to attach a timezone to timezone-naive datetimes
- use `tz_convert()` to convert timezone-aware datetimes to another timezone

```
# Attach a timezone; ambiguous times (e.g. the repeated hour when DST ends) become NaT
df['date_col'] = df['date_col'].dt.tz_localize('America/New_York',
                                               ambiguous='NaT')

# Convert the timezone-aware column to another timezone
df['date_col'] = df['date_col'].dt.tz_convert('Europe/London')
```
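A small sketch of what `ambiguous='NaT'` does in practice. The timestamps are hypothetical; 2017-11-05 01:30 occurs twice in New York because the clocks go back that night, so pandas cannot tell which one is meant.

```python
import pandas as pd

# Timezone-naive timestamps around the end of DST in New York
naive = pd.Series(pd.to_datetime([
    '2017-11-05 00:30:00',   # unambiguous, still daylight time
    '2017-11-05 01:30:00',   # ambiguous, happens twice
    '2017-11-05 03:00:00',   # unambiguous, standard time
]))

# Attach the timezone; the ambiguous entry becomes NaT
ny = naive.dt.tz_localize('America/New_York', ambiguous='NaT')

# Express the same instants in London time
london = ny.dt.tz_convert('Europe/London')

print(ny)
print(london)
```

`tz_localize()` does not move the clock time, it only labels it; `tz_convert()` keeps the instant fixed and changes the displayed clock time.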

# Day name and others

```
df['day_name'] = df['date_col'].dt.day_name()  # e.g. 'Monday'; stored in a new column so the dates are kept
```
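The `.dt` accessor exposes other date parts in the same way. A brief sketch on two made-up dates:

```python
import pandas as pd

dates = pd.Series(pd.to_datetime(['2017-10-01', '2017-10-02']))

day_names = dates.dt.day_name()   # weekday names as strings
months = dates.dt.month           # month numbers
years = dates.dt.year             # years

print(day_names.tolist())  # ['Sunday', 'Monday']
```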

# Shifting time index

```
df['col_name'] = df['col_name'].shift(1)  # shift values down one row; the first row becomes NaN/NaT

df['col_name'] = df['col_name'].dt.total_seconds()  # total seconds (the column must hold timedeltas)
```
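One common use of `shift(1)` is comparing each row with the previous one. The sketch below, on hypothetical ride data, computes the gap between the end of one ride and the start of the next; the column names `Prev end` and `Gap seconds` are invented for this example.

```python
import pandas as pd

# Hypothetical rides, sorted by start time
df = pd.DataFrame({
    'Start date': pd.to_datetime(['2017-10-01 15:23', '2017-10-01 15:42',
                                  '2017-10-02 06:37']),
    'End date': pd.to_datetime(['2017-10-01 15:26', '2017-10-01 17:49',
                                '2017-10-02 06:42']),
})

# Each row receives the previous row's end time; the first row gets NaT
df['Prev end'] = df['End date'].shift(1)

# Time between the previous ride ending and this ride starting, in seconds
df['Gap seconds'] = (df['Start date'] - df['Prev end']).dt.total_seconds()

print(df['Gap seconds'].tolist())
```

The first gap is missing because there is no earlier ride to compare against.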