## Convert time format

The dates in our data are stored as decimal years, so our next processing step is to make them readable.

For example:

1880.5 = 1880 + ½ of a year = 1880 + 6 months = 06/1880 or June 1880

**Our goal**: Convert the date column into separate columns for year and month

### Step 1: Do it for one date 

**Some useful functions**

numpy.ceil(**x**) : Returns x rounded up to the nearest whole number
* **x**: number or array

numpy.floor(**x**) : Returns x rounded down to the nearest whole number
* **x**: number or array

numpy.round(**x**, **decimals**) : Returns x rounded to the given number of decimal places; for an array, rounds every element
* **x**: number or array
* **decimals**: number of decimal places to round to

numpy_array.astype(**data type**) : Converts the elements of an array to a different data type
* **data type**: int, float, str, etc.

In [4]:
import numpy as np

In [5]:
#Useful functions

test = 2.5
print('number:',test)
print('np.ceil(number):', np.ceil(test))
print('np.floor(number):', np.floor(test))
print('np.round(number):', np.round(test,0))

test_array = np.arange(2.75,3.25,0.05)
print('\n array:',test_array)
print('np.ceil(array):', np.ceil(test_array))
print('np.floor(array):', np.floor(test_array))
print('np.round(array):', np.round(test_array,0))

print('\n array with elements as int:',test_array.astype(int))

number: 2.5
np.ceil(number): 3.0
np.floor(number): 2.0
np.round(number): 2.0

 array: [2.75 2.8  2.85 2.9  2.95 3.   3.05 3.1  3.15 3.2 ]
np.ceil(array): [3. 3. 3. 3. 3. 3. 4. 4. 4. 4.]
np.floor(array): [2. 2. 2. 2. 2. 2. 3. 3. 3. 3.]
np.round(array): [3. 3. 3. 3. 3. 3. 3. 3. 3. 3.]

 array with elements as int: [2 2 2 2 2 2 3 3 3 3]
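The `np.round(number)` result above may look surprising: 2.5 becomes 2.0, not 3.0. NumPy rounds values that sit exactly halfway between two whole numbers to the nearest even one. A small sketch to illustrate this (the names `halves` and `rounded` are made up for this example):

```python
import numpy as np

# Values exactly halfway between two whole numbers
halves = np.array([0.5, 1.5, 2.5, 3.5])

# np.round sends each half to the nearest even whole number
rounded = np.round(halves)
print(rounded)  # [0. 2. 2. 4.]
```

This matters here only if a date fraction lands exactly on a half, but it is worth knowing before relying on `np.round` to round "up".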


Let's start small. Find the year and month associated with the given test date.

In [6]:
date = 1880.5
year = np.floor(date)        # whole-year part of the date
month = 12 * (date - year)   # fraction of a year converted to months
print(date)
print('year:', year)
print('month:', month)
#What do these variables equal?
#year = 1880
#month = 6 (June)

1880.5
year: 1880.0
month: 6.0
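The results above are floats (1880.0 and 6.0). To get the 06/1880 form from the example at the top of this section, one option is to convert both values to integers and format them. A minimal sketch, assuming the same test date; `label` is a name made up for this example:

```python
import numpy as np

date = 1880.5

# Convert to plain integers so they print without a trailing ".0"
year = int(np.floor(date))
month = int(round(12 * (date - year)))  # round() guards against small float error

# Zero-pad the month to two digits
label = f"{month:02d}/{year}"
print(label)  # 06/1880
```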


### Step 2: Do it for an array of dates 

Remember: arithmetic on a numpy array is applied to every element, so you can work with the whole array as if it were a single number

For example:

```A = [1 , 2, 3]```

```A * 2 = [2 , 4, 6]```
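As runnable code, the example above might look like the sketch below. Note that `A` has to be a numpy array for this to work; a plain Python list behaves differently (the name `plain_list` is only for this illustration):

```python
import numpy as np

A = np.array([1, 2, 3])

# Arithmetic is applied to every element of the array
doubled = A * 2
print(doubled)         # [2 4 6]

# A plain Python list is repeated instead of multiplied
plain_list = [1, 2, 3]
repeated = plain_list * 2
print(repeated)        # [1, 2, 3, 1, 2, 3]
```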

In [13]:
date_array = np.arange(1880, 1882.05, 1/12)
print(date_array)
rounded_year_array= np.floor(date_array)
month_array= 12*(date_array-rounded_year_array)+1
#What do these arrays equal?
#year_array = 
#month_array =
print(rounded_year_array)
print(month_array)

[1880.         1880.08333333 1880.16666667 1880.25       1880.33333333
 1880.41666667 1880.5        1880.58333333 1880.66666667 1880.75
 1880.83333333 1880.91666667 1881.         1881.08333333 1881.16666667
 1881.25       1881.33333333 1881.41666667 1881.5        1881.58333333
 1881.66666667 1881.75       1881.83333333 1881.91666667 1882.        ]
[1880. 1880. 1880. 1880. 1880. 1880. 1880. 1880. 1880. 1880. 1880. 1880.
 1880. 1881. 1881. 1881. 1881. 1881. 1881. 1881. 1881. 1881. 1881. 1881.
 1881.]
[ 1.  2.  3.  4.  5.  6.  7.  8.  9. 10. 11. 12. 13.  2.  3.  4.  5.  6.
  7.  8.  9. 10. 11. 12. 13.]
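Notice the two 13s in the month output: the dates printed as `1881.` and `1882.` were floored to 1880 and 1881. The likely cause is floating-point error, since the stored values sit just below the whole year even though they print as whole numbers. The sketch below inspects one of them, assuming the same `date_array` as above (`boundary` and the other new names are made up for this example):

```python
import numpy as np

date_array = np.arange(1880, 1882.05, 1/12)

# The 13th element prints as 1881. in the array above
boundary = date_array[12]
print(f"{boundary:.15f}")   # show more digits than the default print

# It is actually slightly below 1881, so floor drops it to 1880
is_below = boundary < 1881
floored = np.floor(boundary)

# Rounding to a few decimal places first removes the tiny error
floored_after_round = np.floor(np.round(boundary, 3))
print(is_below, floored, floored_after_round)
```

Rounding before taking the floor is one way to avoid a month of 13.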


### Step 3: Do this for a Pandas data column

Remember: pandas columns support the same element-wise operations and numpy functions as numpy arrays
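A quick sketch of what that means in practice, using a small made-up dataframe (`example_df` and its values are assumptions for illustration, not the lab data):

```python
import numpy as np
import pandas as pd

example_df = pd.DataFrame({'Date': [1880.0, 1880.25, 1881.5]})

# numpy functions accept a pandas column and return a pandas column
year_column = np.floor(example_df['Date'])

# Results can be stored straight back into the dataframe as new columns
example_df['Year'] = year_column.astype(int)
example_df['Fraction'] = example_df['Date'] - year_column
print(example_df)
```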

In [8]:
import pandas as pd

In [14]:
date_df = pd.DataFrame({'Date': date_array, 'Data': np.random.rand(len(date_array))})
# Round first so values like 1880.9999999 become 1881.0 before taking the floor
rounded_year = np.round(date_df['Date'], 3)
year_df_floor = np.floor(rounded_year)
# Months are numbered from 1 (January) to 12 (December)
month = np.ceil(12 * (date_df['Date'] - year_df_floor) + 1)
print(month)
#What do these columns equal?
#date_df['Year'] = 
#date_df['Month'] =

0      1.0
1      2.0
2      3.0
3      4.0
4      5.0
5      6.0
6      7.0
7      8.0
8      9.0
9     10.0
10    11.0
11    12.0
12     1.0
13     2.0
14     3.0
15     4.0
16     5.0
17     6.0
18     7.0
19     8.0
20     9.0
21    10.0
22    11.0
23    12.0
24     1.0
Name: Date, dtype: float64


In [15]:
date_df.head()

          Date      Data
0  1880.000000  0.354796
1  1880.083333  0.622559
2  1880.166667  0.899229
3  1880.250000  0.456384
4  1880.333333  0.433702


### Step 4: Write a function

Write a function that takes any Pandas column with dates in this format and creates a new dataframe with columns for year and month instead.

Discuss in groups what will go into this function skeleton, and write your pseudo-code in your lab notes.

```
def function_name(function_inputs):
    do something
    return function_outputs
```
        

In [17]:
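#Function goes here

def cvrt_datetoyearmonth(x):
    # x: pandas column of decimal-year dates, e.g. date_df['Date']

    # Round first so values just below a whole year are not floored to the previous year
    year_df_rounded = np.round(x, 3)
    print('\n', year_df_rounded.head())

    year_df_floor = np.floor(year_df_rounded)
    print('\n', year_df_floor.head())

    # Months are numbered from 1 (January) to 12 (December)
    month = np.ceil(12 * (x - year_df_floor) + 1)
    print('\n', month.head())

    # Return a new dataframe with columns for year and month
    return pd.DataFrame({'Year': year_df_floor.astype(int), 'Month': month.astype(int)})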
    
    

In [None]:
#Test on date_df
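One possible way to check a conversion function is to run it on dates whose answers you already know. The sketch below is self-contained and uses a hypothetical helper, `split_decimal_year`, which is not the function from the cell above: it swaps in a different technique (rounding the total month count, then splitting it with integer division) and assumes the Step 2 convention that a fraction of 0 means month 1.

```python
import numpy as np
import pandas as pd

def split_decimal_year(dates):
    """Hypothetical helper: decimal-year column -> dataframe with Year and Month."""
    # Count whole months since year 0; rounding absorbs small float error
    total_months = np.round(dates * 12).astype(int)
    return pd.DataFrame({'Year': total_months // 12,
                         'Month': total_months % 12 + 1})

# Same monthly test dates as in Step 2
date_array = np.arange(1880, 1882.05, 1/12)
date_df = pd.DataFrame({'Date': date_array})
result = split_decimal_year(date_df['Date'])

# Every month should fall between 1 and 12
assert result['Month'].between(1, 12).all()
# The 13th date is the start of 1881, so it should be month 1 of 1881
assert result.loc[12, 'Year'] == 1881 and result.loc[12, 'Month'] == 1

print(result.head())
```

The same two assertions can be pointed at the output of your own function to confirm it never produces a month of 13.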