Today, I will be trying out the Parsing Dates Challenge....

* [Check the data type of our date column](#Check-the-data-type-of-our-date-column)
* [Convert our date columns to datetime](#Convert-our-date-columns-to-datetime)
* [Select just the day of the month from our column](#Select-just-the-day-of-the-month-from-our-column)
* [Plot the day of the month to check the date parsing](#Plot-the-day-of-the-month-to-check-the-date-parsing)

Go Go kit N kitty lets Go!

# Get our environment set up
________

The first thing is to load the datasets by running the cell below:

In [1]:
# modules we'll use
import pandas as pd
import numpy as np
import seaborn as sns
import datetime

# read in our data
earthquakes = pd.read_csv("../input/earthquake-database/database.csv")
landslides = pd.read_csv("../input/landslide-events/catalog.csv")
volcanos = pd.read_csv("../input/volcanic-eruptions/database.csv")

# set seed for reproducibility
np.random.seed(0)

Then I will print the first 5 rows to make sure we have dates in there, and then check the data type.


In [2]:
# now, check if the date rows are all dates.
print(earthquakes['Date'].head())
#Checking the data type of the Date column in the earthquakes dataframe
earthquakes['Date'].dtype


Hmmm... I also get a dtype of object (`O`). Well, thanks to Rachael, I can always use the [numpy documentation](https://docs.scipy.org/doc/numpy-1.12.0/reference/generated/numpy.dtype.kind.html#numpy.dtype.kind) to match the letter code to the dtype of the object.
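Here is a tiny sketch of what those letter codes look like in practice. The two date strings are made-up sample values, not rows read from the dataset; the point is only that text dates report kind `O` and parsed dates report kind `M`.

```python
import pandas as pd

# made-up sample values standing in for an unparsed date column
dates_as_text = pd.Series(["01/02/1965", "01/04/1965"], dtype="object")
print(dates_as_text.dtype.kind)  # 'O' is the letter code for "object"

# after parsing, the letter code changes to 'M' (datetime)
parsed = pd.to_datetime(dates_as_text, format="%m/%d/%Y")
print(parsed.dtype.kind)
```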

Let's see if I can parse the `Date` column successfully this time around.

I will keep this information for future reference

(We can tell pandas what the format of our dates is with a guide called a ["strftime directive", which you can find more information on at this link](http://strftime.org/). The basic idea is that you need to point out which parts of the date are where and what punctuation is between them. There are [lots of possible parts of a date](http://strftime.org/), but the most common are `%d` for day, `%m` for month, `%y` for a two-digit year and `%Y` for a four-digit year.

Some examples:

 * 1/17/07 has the format "%m/%d/%y"
 * 17-1-2007 has the format "%d-%m-%Y"
 
 Looking back up at the head of the `date` column in the landslides dataset, we can see that it's in the format "month/day/two-digit year", so we can use the same syntax as the first example to parse in our dates: )
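To make the two examples above concrete, here is a small sketch that parses each string with its own format. The variable names are my own; both strings describe the same day, so they should come out equal.

```python
import pandas as pd

# the two example strings from the list above, each with its own format
month_first = pd.to_datetime("1/17/07", format="%m/%d/%y")
day_first = pd.to_datetime("17-1-2007", format="%d-%m-%Y")

print(month_first)
print(day_first)
print(month_first == day_first)  # True
```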
 
 

Now let me try parsing the dates with the matching format, so that we can interact with them in useful ways.

In [3]:
# Create a new column, date_parsed, in the earthquakes
# dataset that has correctly parsed dates in it.
earthquakes['date_parsed'] = pd.to_datetime(earthquakes['Date'], format="%m/%d/%Y")
earthquakes['date_parsed'].head()
# (Don't forget to double-check that the dtype is correct!)
earthquakes['date_parsed'].dtype


Oh no! I'm still getting an error message. Let me try what https://www.kaggle.com/chrisbow did...

In [4]:
# with errors='coerce', rows that don't match the format become NaT instead of raising
parsed = pd.to_datetime(earthquakes['Date'], errors='coerce', format="%m/%d/%Y")
print(parsed)
# show the original strings of the rows that failed to parse
mask = parsed.isnull()
print(earthquakes['Date'][mask])

I've seen the enemy... The last 3 rows printed have year-first formatting (year, then month, then day) instead of %m/%d/%Y.
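One way to deal with stragglers like these is to parse in two passes and then combine the results. This is only a sketch: the sample strings and the second format (`%Y-%m-%d`) are my own stand-ins, so the real rows may need a different format.

```python
import pandas as pd

# hypothetical sample: mostly month/day/year, with one year-first straggler
dates = pd.Series(["01/02/1965", "01/04/1965", "1975-02-23"], dtype="object")

# first pass: rows that don't match the format become NaT instead of raising
first_pass = pd.to_datetime(dates, format="%m/%d/%Y", errors="coerce")
stragglers = first_pass.isnull()

# second pass: parse only the stragglers, using their own (assumed) format
second_pass = pd.to_datetime(dates[stragglers], format="%Y-%m-%d", errors="coerce")

# fill the gaps left by the first pass (values are matched up by index)
combined = first_pass.fillna(second_pass)
print(combined)
```

Anything still missing after the second pass would be a row that matches neither format, so it is worth checking `combined.isnull().sum()` before moving on.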

A reminder from the tutorial: if we try to get the day of the month from a column that hasn't been parsed, we get an error. The important part to look at is the very end, which says `AttributeError: Can only use .dt accessor with datetimelike values`. We get this error because the `.dt.day` accessor doesn't know how to deal with a column with the dtype "object". Even though our dataframe has dates in it, because they haven't been parsed we can't interact with them in a useful way.

Luckily, pandas can try to infer the date format for us, and once the column is parsed we can get the day of the month out no problem:

In [5]:
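# let pandas infer the date format (infer_datetime_format is deprecated from pandas 2.0, where mixed formats may need format="mixed" instead)
earthquakes['date_Parsed'] = pd.to_datetime(earthquakes['Date'], infer_datetime_format = True)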

In [6]:
earthquakes.date_Parsed.head()

He he he....I can see the data type

In [8]:
# now i can go on like a normal human being and get the day of the month from the date_parsed column
day_of_month_earthquakes = earthquakes['date_parsed'].dt.day

Huh? Does this mean I'm not yet a human being?? Ahh

In [10]:
# I see you small 'p'

In [9]:
# I have to try again, key error seems to be date_parsed
earthquakes['day_Month_earthquakes'] = earthquakes['date_Parsed'].dt.day
earthquakes['day_Month_earthquakes'].head()

Celebrations....  Rejoice Humans peeple, the Boov have arrived!
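For future reference, here is a small sketch of why the parsing step matters for `.dt.day`. The sample strings are made up; the `try`/`except` is only there so the example keeps running after the error.

```python
import pandas as pd

# made-up sample values standing in for an unparsed date column
text_dates = pd.Series(["01/02/1965", "01/04/1965"], dtype="object")

# .dt only works on datetime-like values, so this raises AttributeError
try:
    text_dates.dt.day
    accessor_error = None
except AttributeError as err:
    accessor_error = str(err)
print(accessor_error)

# once the column is parsed, the day of the month comes out without trouble
parsed = pd.to_datetime(text_dates, format="%m/%d/%Y")
print(parsed.dt.day.tolist())  # [2, 4]
```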

# Plot the day of the month to check the date parsing
I hear one of the biggest dangers in parsing dates is mixing up the months and days, so I'm keeping this crucial hint:
The to_datetime() function does have very helpful error messages, but it doesn't hurt to double-check that the days of the month we've extracted make sense.
To do this, let's plot a histogram of the days of the month. We expect it to have values between 1 and 31 and, since there's no reason to suppose the earthquakes are more common on some days of the month than others, a relatively even distribution. (With a dip on 31 because not all months have 31 days.) Let's see if that's the case:

In [13]:
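# remove na's (keep the result in its own variable: assigning it back
# to the column would realign on the index and bring the NaNs back)
day_of_month = earthquakes.day_Month_earthquakes.dropna()

# plot the day of the month (distplot is deprecated in newer seaborn; histplot replaces it)
sns.distplot(day_of_month, kde=False, bins=31)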

Yep, it looks like we did parse our dates correctly & this graph makes good sense to me.
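Alongside the histogram, a quick numeric check can confirm the same thing. This is a sketch on a few made-up dates rather than the real column; with the real data I would swap in `earthquakes['date_Parsed']`.

```python
import pandas as pd

# hypothetical stand-in for the parsed dates, including one missing value
parsed = pd.Series(pd.to_datetime(["1965-01-02", "1965-01-04", "1975-02-23", None]))

# drop missing values, then confirm every day falls between 1 and 31
days = parsed.dt.day.dropna()
print(days.between(1, 31).all())

# count how often each day of the month appears
print(days.value_counts().sort_index())
```

If months and days had been swapped during parsing, the counts would pile up on days 1 to 12, which is the pattern to watch for.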

# More practice!
___

If you're interested in graphing time series, [check out this Learn tutorial](https://www.kaggle.com/residentmario/time-series-plotting-optional).

You can also look into passing columns that you know have dates in them to the `parse_dates` argument in `read_csv`. (The documentation [is here](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html).) Do note that this method can be very slow, but depending on your needs it may sometimes be handy to use.
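A minimal sketch of the `parse_dates` route, using a small in-memory CSV so it runs without the real files. The column names and values are made up for illustration.

```python
import io
import pandas as pd

# a tiny made-up CSV standing in for a real file on disk
csv_text = "Date,Magnitude\n01/02/1965,6.0\n01/04/1965,5.8\n"

# ask read_csv to parse the Date column while loading
quakes = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])
print(quakes["Date"].dtype)
print(quakes["Date"].dt.day.tolist())  # [2, 4]
```

This only works cleanly when every row shares one format, so a column with stragglers would likely still need the `errors='coerce'` treatment afterwards.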

For an extra challenge, you can try parsing the column `Last Known Eruption` from the `volcanos` dataframe. This column contains a mixture of text ("Unknown") and years both before the common era (BCE, also known as BC) and in the common era (CE, also known as AD).

In [None]:
volcanos['Last Known Eruption'].sample(5)
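Here is one possible starting point for the challenge, not a finished solution. It assumes the values look like `"1950 CE"`, `"750 BCE"` or `"Unknown"`, which I have not verified against every row, and it stores years as signed numbers because pandas timestamps cannot go back that far. The helper name `eruption_year` is my own.

```python
import pandas as pd

def eruption_year(text):
    """Return a signed year (negative for BCE), or None if it can't be read."""
    parts = str(text).split()
    if len(parts) != 2 or not parts[0].isdigit():
        return None
    year, era = int(parts[0]), parts[1].upper()
    if era == "BCE":
        return -year
    if era == "CE":
        return year
    return None

# made-up sample values in the assumed format
sample = pd.Series(["1950 CE", "750 BCE", "Unknown"])
years = sample.map(eruption_year)
print(years)
```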