## Convert time format

Our next step in processing our data is making our dates readable. 

For example:

1880.5 = 1880 + ½ of a year = 1880 + 6 months = 06/1880 or June 1880
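As a rough sketch of that arithmetic in plain Python (the variable names are illustrative, not part of the lab), split the decimal year into its whole part and its fractional part, then scale the fraction by 12 to get the number of months elapsed since the start of the year:

```python
import math

decimal_year = 1880.5

# The whole part is the year; the remainder is the fraction of the year elapsed
year = math.floor(decimal_year)
fraction = decimal_year - year

# There are twelve months in a year, so scale the fraction by 12
months_elapsed = round(12 * fraction)

print(year, fraction, months_elapsed)
```

How "months elapsed" maps onto a calendar month depends on the convention the dataset uses (for instance, whether a date ending in .0 marks the first month), so it is worth checking that against the data source.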

**Our goal**: Convert the date column into separate columns for year and month

### Step 1: Do it for one date 

**Some useful functions**

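numpy.ceil(**x**) : Returns x rounded up to the nearest whole number (the smallest whole number greater than or equal to x)
* **x**: number or array

numpy.floor(**x**) : Returns x rounded down to the nearest whole number (the largest whole number less than or equal to x)
* **x**: number or array

numpy.round(**x**, **decimals**) : Returns x rounded to the given number of decimal places; for an array, every element is rounded. Values exactly halfway between two options round to the nearest even one, so 2.5 rounds to 2.0
* **x**: number or array
* **decimals**: number of decimal places to round to

numpy_array.astype(**data type**) : Returns a copy of the array with its elements converted to a different data type. Converting floats to int drops the fractional part rather than rounding
* **data type**: int, float, str, etc.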

In [1]:
import numpy as np

In [2]:
#Useful functions

test = 2.5
print('number:', test)
print('np.ceil(number):', np.ceil(test))
print('np.floor(number):', np.floor(test))
print('np.round(number):', np.round(test, 0))

test_array = np.arange(2.75, 3.25, 0.05)
print('\n array:', test_array)
print('np.ceil(array):', np.ceil(test_array))
print('np.floor(array):', np.floor(test_array))
print('np.round(array):', np.round(test_array, 0))

print('\n array with elements as int:', test_array.astype(int))

number: 2.5
np.ceil(number): 3.0
np.floor(number): 2.0
np.round(number): 2.0

 array: [2.75 2.8  2.85 2.9  2.95 3.   3.05 3.1  3.15 3.2 ]
np.ceil(array): [3. 3. 3. 3. 3. 3. 4. 4. 4. 4.]
np.floor(array): [2. 2. 2. 2. 2. 2. 3. 3. 3. 3.]
np.round(array): [3. 3. 3. 3. 3. 3. 3. 3. 3. 3.]

 array with elements as int: [2 2 2 2 2 2 3 3 3 3]
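One detail in the output above is worth a second look: the element printed as `3.` floors to `2.` and becomes `2` under `astype(int)`. The likely cause is floating-point error, since `np.arange` with a fractional step can store a value a hair below the whole number it displays as. A minimal sketch of the effect and one possible guard (rounding before flooring), using a made-up value rather than the array above:

```python
import numpy as np

# A value that is meant to be 3.0 but is stored very slightly below it
almost_three = 3.0 - 1e-12

print(np.floor(almost_three))               # floors down to 2.0
print(np.floor(np.round(almost_three, 6)))  # rounding first recovers 3.0

# astype(int) drops the fractional part, so it has the same problem
values = np.array([almost_three, 3.05])
print(values.astype(int))
print(np.round(values, 6).astype(int))
```

The number of decimals to round to (6 here) is an arbitrary choice for illustration; it only needs to be coarser than the error and finer than the spacing of the data.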


Let's start small. Find the year and month associated with the given test date.

In [3]:
date = 1880.5
print(date)
year = np.floor(date)
print('np.floor(date):', year)
#What should this variable equal?
#month = 

1880.5
np.floor(date): 1880.0


### Step 2: Do it for an array of dates 

Remember: arithmetic on a numpy array is applied to every element, so you can work with a whole array as if it were a single number

For example:

```A = [1 , 2, 3]```

```A * 2 = [2 , 4, 6]```
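A short runnable version of that example, with two extra operations to show that both arithmetic and numpy functions apply to every element at once (the array `A` mirrors the example above; the other names are illustrative):

```python
import numpy as np

A = np.array([1, 2, 3])

doubled = A * 2              # multiplies every element by 2
shifted = A + 0.5            # adds 0.5 to every element
floored = np.floor(shifted)  # numpy functions also work element by element

print(doubled)
print(shifted)
print(floored)
```

A plain Python list behaves differently: `[1, 2, 3] * 2` repeats the list rather than doubling each value, which is one reason to convert to a numpy array first.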

In [7]:
date_array = np.arange(1880, 1882.05, 1/12)
#print(date_array)
#What do these arrays equal?
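# Dates rounded to 3 decimals; this also snaps a value such as 1880.9999999 back to 1881.0
year_array = np.round(date_array, 3)
# Flooring the rounded dates gives the year
year_df_floor = np.floor(year_array)
# Flooring the raw dates can land in the previous year when floating-point error
# leaves a date slightly below a whole number, so it is not used for the month
date_array_floor = np.floor(date_array)
# Months elapsed since the start of the year, shifted so the first month is 1
month_array = np.round(12 * (date_array - year_df_floor)) + 1
print(year_array)
print(month_array)

[1880.    1880.083 1880.167 1880.25  1880.333 1880.417 1880.5   1880.583
 1880.667 1880.75  1880.833 1880.917 1881.    1881.083 1881.167 1881.25
 1881.333 1881.417 1881.5   1881.583 1881.667 1881.75  1881.833 1881.917
 1882.   ]
[ 1.  2.  3.  4.  5.  6.  7.  8.  9. 10. 11. 12.  1.  2.  3.  4.  5.  6.
  7.  8.  9. 10. 11. 12.  1.]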


### Step 3: Do this for a Pandas data column

Remember: for element-wise arithmetic, pandas columns behave much like numpy arrays
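A small sketch of what that looks like in practice, using a made-up three-row dataframe (the column names and values are placeholders, not the lab data):

```python
import numpy as np
import pandas as pd

toy_df = pd.DataFrame({'Date': [1990.0, 1990.25, 1991.75]})

# Arithmetic and numpy functions on a column apply to every row
toy_df['Whole'] = np.floor(toy_df['Date'])
toy_df['Fraction'] = toy_df['Date'] - toy_df['Whole']

print(toy_df)
```

One difference to keep in mind: a pandas column carries an index, and arithmetic between two columns lines values up by that index rather than by position.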

In [8]:
import pandas as pd

In [9]:
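date_df = pd.DataFrame({'Date': date_array, 'Data': np.random.rand(len(date_array))})
#What do these columns equal?
# Round before flooring so a date stored slightly below a whole year still gets the right year
date_df['Year'] = np.floor(date_df['Date'].round(3))
# Months elapsed since the start of the year, shifted so the first month is 1
date_df['Month'] = (12 * (date_df['Date'] - date_df['Year'])).round() + 1
date_df.head()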

          Date      Data    Year  Month
0  1880.000000  0.124892  1880.0    1.0
1  1880.083333  0.037051  1880.0    2.0
2  1880.166667  0.621888  1880.0    3.0
3  1880.250000  0.306508  1880.0    4.0
4  1880.333333  0.046806  1880.0    5.0


### Step 4: Write a function

Write a function that takes any Pandas column with dates in this format and creates a new dataframe with columns for year and month instead.

Discuss in groups what will go into this function skeleton, and write your pseudo-code in your lab notes.

```
def function_name(function_inputs):
    do something
    return function_outputs
```
        

In [10]:
#Function goes here
def convert(df):
    """Return a copy of df with Year and Month columns computed from df['Date']."""
    out = df.copy()
    # Round before flooring so a date stored slightly below a whole year still gets the right year
    out['Year'] = np.floor(out['Date'].round(3))
    # Months elapsed since the start of the year, shifted so the first month is 1
    out['Month'] = (12 * (out['Date'] - out['Year'])).round() + 1
    return out

In [12]:
print(convert(date_df))
date_df.head()

           Date      Data    Year  Month
0   1880.000000  0.124892  1880.0    1.0
1   1880.083333  0.037051  1880.0    2.0
2   1880.166667  0.621888  1880.0    3.0
3   1880.250000  0.306508  1880.0    4.0
4   1880.333333  0.046806  1880.0    5.0
5   1880.416667  0.211522  1880.0    6.0
6   1880.500000  0.953217  1880.0    7.0
7   1880.583333  0.776854  1880.0    8.0
8   1880.666667  0.296391  1880.0    9.0
9   1880.750000  0.354870  1880.0   10.0
10  1880.833333  0.047125  1880.0   11.0
11  1880.916667  0.813626  1880.0   12.0
12  1881.000000  0.650763  1881.0    1.0
13  1881.083333  0.595699  1881.0    2.0
14  1881.166667  0.868120  1881.0    3.0
15  1881.250000  0.463495  1881.0    4.0
16  1881.333333  0.918701  1881.0    5.0
17  1881.416667  0.030259  1881.0    6.0
18  1881.500000  0.514246  1881.0    7.0
19  1881.583333  0.662728  1881.0    8.0
20  1881.666667  0.668793  1881.0    9.0
21  1881.750000  0.736099  1881.0   10.0
22  1881.83

          Date      Data    Year  Month
0  1880.000000  0.124892  1880.0    1.0
1  1880.083333  0.037051  1880.0    2.0
2  1880.166667  0.621888  1880.0    3.0
3  1880.250000  0.306508  1880.0    4.0
4  1880.333333  0.046806  1880.0    5.0

In [None]:
#Test on date_df
result = convert(date_df)
# Every month should fall between 1 and 12
print(result['Month'].between(1, 12).all())
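
The task statement asks for year and month columns instead of the decimal date, and `astype` from the list of useful functions has not been needed so far. One possible variant, sketched here under those assumptions (the function name `split_decimal_date` and its `date_col` parameter are hypothetical, not part of the lab), drops the original column and stores both results as integers:

```python
import numpy as np
import pandas as pd

def split_decimal_date(df, date_col='Date'):
    """Return a new dataframe with integer Year and Month columns
    replacing the decimal-year column named by date_col."""
    out = df.drop(columns=[date_col])
    # Round before flooring to absorb floating-point error near whole years
    year = np.floor(df[date_col].round(3))
    # Months elapsed since the start of the year, shifted so the first month is 1
    month = (12 * (df[date_col] - year)).round() + 1
    out['Year'] = year.astype(int)
    out['Month'] = month.astype(int)
    return out

# Example input: two years of monthly dates with a placeholder data column
dates = 1880 + np.arange(24) / 12
example = split_decimal_date(pd.DataFrame({'Date': dates, 'Data': 0.0}))
print(example.head())
```

Returning a new dataframe rather than modifying the input keeps the original dates available, which makes it easier to compare the result against the source column while testing.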