## **Getting to know our data**

In [2]:
import pandas as pd
import numpy as np

In [4]:
data = 'https://s3-us-west-2.amazonaws.com/streamlit-demo-data/uber-raw-data-sep14.csv.gz'
df = pd.read_csv(data, nrows=500)
df.head()

Unnamed: 0,Date/Time,Lat,Lon,Base
0,9/1/2014 0:01:00,40.2201,-74.0021,B02512
1,9/1/2014 0:01:00,40.75,-74.0027,B02512
2,9/1/2014 0:03:00,40.7559,-73.9864,B02512
3,9/1/2014 0:06:00,40.745,-73.9889,B02512
4,9/1/2014 0:11:00,40.8145,-73.9444,B02512


## **Converting the column names to lower case**
Consistent lower-case names help avoid typos when we refer to the columns later.


In [5]:
# Convert each column label to a string and lower-case it
lower_str = lambda x: str(x).lower()
df.rename(lower_str, axis='columns', inplace=True)
df.head()

Unnamed: 0,date/time,lat,lon,base
0,9/1/2014 0:01:00,40.2201,-74.0021,B02512
1,9/1/2014 0:01:00,40.75,-74.0027,B02512
2,9/1/2014 0:03:00,40.7559,-73.9864,B02512
3,9/1/2014 0:06:00,40.745,-73.9889,B02512
4,9/1/2014 0:11:00,40.8145,-73.9444,B02512


As you can see, the column names are now in lower case.
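
As a side note, the same renaming can be sketched in two other ways. The example below uses a small hand-made frame (`sample`, a hypothetical stand-in for the Uber data) so that it runs without downloading anything. It assumes every column label is already a string, which is why `str.lower` can be used without the `str(x)` conversion from the cell above.

```python
import pandas as pd

# Hypothetical stand-in with the same column names as the Uber sample.
sample = pd.DataFrame({
    "Date/Time": ["9/1/2014 0:01:00", "9/1/2014 0:03:00"],
    "Lat": [40.2201, 40.7559],
    "Lon": [-74.0021, -73.9864],
    "Base": ["B02512", "B02512"],
})

# Option 1: rename() accepts any callable, so str.lower can be passed directly.
# Without inplace=True it returns a new frame and leaves `sample` untouched.
renamed = sample.rename(columns=str.lower)

# Option 2: lower-case the column Index with its vectorised string method.
sample.columns = sample.columns.str.lower()

print(list(renamed.columns))
print(list(sample.columns))
```

Both options give the same column names; the second one modifies the frame directly, much like `inplace=True` does.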


## **Checking whether the dates are datetime**
To access only the hours in our `date/time` column,
we first have to make sure that the column has a datetime dtype.

In [7]:
df.dtypes

date/time     object
lat          float64
lon          float64
base          object
dtype: object

As we can see, the `date/time` column has the object dtype, so we will convert it to datetime.

In [11]:
df['date/time'] = pd.to_datetime(df['date/time'])

In [9]:
df.dtypes

date/time    datetime64[ns]
lat                 float64
lon                 float64
base                 object
dtype: object

Now our column is of the datetime type.
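
With a datetime column, the hours mentioned at the start of this section can be read through the `.dt` accessor. The sketch below uses a tiny hand-made frame (`sample`, hypothetical) instead of the real data. The explicit `format` string is an assumption that the values are written as month/day/year, which matches the September 2014 file name; it can be dropped to let pandas infer the format as in the cell above.

```python
import pandas as pd

# Hypothetical stand-in for the 'date/time' column of the Uber sample.
sample = pd.DataFrame({
    "date/time": ["9/1/2014 0:01:00", "9/1/2014 13:45:00", "9/2/2014 23:59:00"],
})

# Parse the strings; the format is assumed to be month/day/year hour:minute:second.
sample["date/time"] = pd.to_datetime(sample["date/time"], format="%m/%d/%Y %H:%M:%S")

# The .dt accessor only works on datetime columns; .dt.hour returns the hour (0-23).
hours = sample["date/time"].dt.hour

print(hours.tolist())
```

Calling `.dt.hour` on the original object column would raise an `AttributeError`, which is why the conversion has to come first.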

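**Note that for a column whose name is a valid Python identifier, bracket access and attribute access are equivalent, for example** `df['lat']` **and** `df.lat`.
**In this case, however, we cannot write** `df.date/time` **because Python would parse the '/' as a division operator.**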
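
The difference between the two access styles can be checked with a short sketch. The frame `sample` is a hypothetical stand-in, and `str.isidentifier()` is used here only as a quick way to see which column names could be written after a dot.

```python
import pandas as pd

# Hypothetical stand-in with one "awkward" and one plain column name.
sample = pd.DataFrame({
    "date/time": pd.to_datetime(["2014-09-01 00:01:00"]),
    "lat": [40.2201],
})

# For a plain name, attribute access and bracket access return the same column.
same = sample.lat.equals(sample["lat"])

# A name must be a valid Python identifier to be usable as an attribute.
lat_ok = "lat".isidentifier()
datetime_ok = "date/time".isidentifier()

print(same, lat_ok, datetime_ok)
```

Bracket access works for every column name, so it is the safer habit when names contain characters such as '/' or spaces.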
