## Convert time format

Our next step in processing our data is making our dates readable. 

For example:

1880.5 = 1880 + ½ of a year = 1880 + 6 months = 06/1880 or June 1880

**Our goal**: Convert the date column into separate columns for year and month

### Step 1: Do it for one date 

**Some useful functions**

numpy.ceil(**x**) : Returns x rounded up to the nearest whole number (the smallest integer greater than or equal to x)
* **x**: number or array

numpy.floor(**x**) : Returns x rounded down to the nearest whole number (the largest integer less than or equal to x)
* **x**: number or array

numpy.round(**x**, **decimals**) : Returns x rounded to the given number of decimal places; for an array, every element is rounded
* **x**: number or array
* **decimals**: number of decimal places to round to (default 0)

numpy_array.astype(**data type**) : Returns a copy of the array with its elements converted to a different data type (converting floats to int drops the fractional part)
* **data type**: int, float, str, etc.

In [1]:
import numpy as np

In [2]:
#Useful functions

test = 2.5
print('number:',test)
print('np.ceil(number):', np.ceil(test))
print('np.floor(number):', np.floor(test))
#np.round rounds halves to the nearest even number, so 2.5 becomes 2.0
print('np.round(number):', np.round(test,0))

test_array = np.arange(2.75,3.25,0.05)
print('\n array:',test_array)
print('np.ceil(array):', np.ceil(test_array))
print('np.floor(array):', np.floor(test_array))
print('np.round(array):', np.round(test_array,0))

print('\n array with elements as int:',test_array.astype(int))

number: 2.5
np.ceil(number): 3.0
np.floor(number): 2.0
np.round(number): 2.0

 array: [2.75 2.8  2.85 2.9  2.95 3.   3.05 3.1  3.15 3.2 ]
np.ceil(array): [3. 3. 3. 3. 3. 3. 4. 4. 4. 4.]
np.floor(array): [2. 2. 2. 2. 2. 2. 3. 3. 3. 3.]
np.round(array): [3. 3. 3. 3. 3. 3. 3. 3. 3. 3.]

 array with elements as int: [2 2 2 2 2 2 3 3 3 3]
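
Notice that the element printed as `3.` in the array above has a floor of `2.`, not `3.`. The sketch below looks at that element more closely: `np.arange` with a fractional step accumulates small floating-point errors, so the stored value sits just below 3. Rounding to a few decimal places before flooring is one way to guard against this; the choice of 3 decimal places here is an assumption that mirrors the rounding used later in this section.

```python
import numpy as np

test_array = np.arange(2.75, 3.25, 0.05)

# The sixth element prints as 3. but is stored as a value just below 3
value = test_array[5]
print(repr(float(value)))

# Flooring the raw value lands on 2; flooring after rounding lands on 3
floor_raw = np.floor(value)
floor_after_round = np.floor(np.round(value, 3))
print(floor_raw, floor_after_round)
```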


Let's start small. Find the year and month associated with the given test date.

In [3]:
date = 1880.5
#What do these variables equal?
year = np.floor(date)
month = 12*(date-year)
print(date)
print(year)
print(month)

1880.5
1880.0
6.0
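
The value `6.0` is the number of months elapsed since the start of the year, and both results are still floats. As a minimal sketch, assuming whole numbers are wanted, the two parts can be converted with `int()`; the name `months_elapsed` is introduced here only for illustration.

```python
import numpy as np

date = 1880.5

# Whole-number year
year = int(np.floor(date))

# Months elapsed since the start of that year, as a whole number
months_elapsed = int(round(12 * (date - year)))

print(year, months_elapsed)
```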


### Step 2: Do it for an array of dates 

Remember: you can run arithmetic on a numpy array as if it were a single number; the operation is applied to every element

For example:

```A = np.array([1, 2, 3])```

```A * 2 = [2 4 6]```
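
This element-wise behaviour belongs to numpy arrays, not to plain Python lists. The short sketch below contrasts the two; the variable names are chosen only for illustration.

```python
import numpy as np

A = np.array([1, 2, 3])

# Multiplying a numpy array multiplies every element
doubled = A * 2
print(doubled)

# Multiplying a plain Python list repeats the list instead
repeated = [1, 2, 3] * 2
print(repeated)
```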

In [4]:
date_array = np.arange(1880, 1882.05, 1/12)
print(date_array)

#What do these arrays equal?
year_array_rounded = np.round(date_array, 3)
print('\n', year_array_rounded)

year_array_floor = np.floor(year_array_rounded)
print('\n', year_array_floor)

month_array = np.ceil(12*(date_array-year_array_floor)+1)
print('\n', month_array)

[1880.         1880.08333333 1880.16666667 1880.25       1880.33333333
 1880.41666667 1880.5        1880.58333333 1880.66666667 1880.75
 1880.83333333 1880.91666667 1881.         1881.08333333 1881.16666667
 1881.25       1881.33333333 1881.41666667 1881.5        1881.58333333
 1881.66666667 1881.75       1881.83333333 1881.91666667 1882.        ]

 [1880.    1880.083 1880.167 1880.25  1880.333 1880.417 1880.5   1880.583
 1880.667 1880.75  1880.833 1880.917 1881.    1881.083 1881.167 1881.25
 1881.333 1881.417 1881.5   1881.583 1881.667 1881.75  1881.833 1881.917
 1882.   ]

 [1880. 1880. 1880. 1880. 1880. 1880. 1880. 1880. 1880. 1880. 1880. 1880.
 1881. 1881. 1881. 1881. 1881. 1881. 1881. 1881. 1881. 1881. 1881. 1881.
 1882.]

 [ 1.  2.  3.  4.  5.  6.  7.  8.  9. 10. 11. 12.  1.  2.  3.  4.  5.  6.
  7.  8.  9. 10. 11. 12.  1.]
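
The month formula above relies on `np.ceil`, which can be thrown off by one if floating-point error pushes a value slightly above a whole number. As a hedged alternative, and assuming every date marks the start of a month as in `date_array`, the months elapsed can be rounded to the nearest whole number before adding 1. The name `months_elapsed` is introduced here for illustration only.

```python
import numpy as np

date_array = np.arange(1880, 1882.05, 1/12)

# Round before flooring so values like 1881.9999999 count as 1882
year = np.floor(np.round(date_array, 3))

# Months elapsed in the year, snapped to the nearest whole number
months_elapsed = np.round(12 * (date_array - year))

# Month number from 1 (January) to 12 (December), stored as integers
month = (months_elapsed + 1).astype(int)
print(month)
```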


### Step 3: Do this for a Pandas data column

Remember: pandas columns behave much like numpy arrays, so the same element-wise arithmetic and numpy functions work on them

In [5]:
import pandas as pd

In [6]:
date_df = pd.DataFrame({'Date':date_array,'Data':np.random.rand(len(date_array))})
date_df.head()

#Round to 3 decimal places
year_rounded = np.round(date_df['Date'],3)

#Round down to the nearest whole number
year = np.floor(year_rounded)

#Find month
month = np.ceil(12*(date_df['Date']-year)+1)

print(year)
print(month)

#Index columns
#print(date_df[['Date']].head(2),'\n')
      
#Index rows
#print(date_df.loc[0])


0     1880.0
1     1880.0
2     1880.0
3     1880.0
4     1880.0
5     1880.0
6     1880.0
7     1880.0
8     1880.0
9     1880.0
10    1880.0
11    1880.0
12    1881.0
13    1881.0
14    1881.0
15    1881.0
16    1881.0
17    1881.0
18    1881.0
19    1881.0
20    1881.0
21    1881.0
22    1881.0
23    1881.0
24    1882.0
Name: Date, dtype: float64
0      1.0
1      2.0
2      3.0
3      4.0
4      5.0
5      6.0
6      7.0
7      8.0
8      9.0
9     10.0
10    11.0
11    12.0
12     1.0
13     2.0
14     3.0
15     4.0
16     5.0
17     6.0
18     7.0
19     8.0
20     9.0
21    10.0
22    11.0
23    12.0
24     1.0
Name: Date, dtype: float64


In [15]:
#Create new column for year
print(year.head())
date_df['Year'] = year


0    1880.0
1    1880.0
2    1880.0
3    1880.0
4    1880.0
Name: Date, dtype: float64


In [14]:
#Create a new column for month
print(month.head())
date_df['Month'] = month

0    1.0
1    2.0
2    3.0
3    4.0
4    5.0
Name: Date, dtype: float64
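
The new columns hold floats such as `1880.0` and `1.0`. As a minimal sketch, assuming whole-number columns are preferred, `astype(int)` converts them. The small dataframe `example_df` is made up for illustration, and its dates are chosen so that the arithmetic is exact.

```python
import numpy as np
import pandas as pd

# Made-up example dates (start of January, April and July)
example_df = pd.DataFrame({'Date': [1880.0, 1880.25, 1881.5]})

# Same steps as above: round, floor, then find the month
year = np.floor(np.round(example_df['Date'], 3))
month = np.ceil(12 * (example_df['Date'] - year) + 1)

# Store the results as integer columns
example_df['Year'] = year.astype(int)
example_df['Month'] = month.astype(int)
print(example_df)
```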


### Step 4: Write a function

Write a function that takes any Pandas column with dates in this format and creates a new dataframe with columns for year and month instead.

Discuss in groups what will go into this function skeleton and write your pseudo-code in your lab notes.

```
def function_name(function_inputs):
    do something
    return function_outputs
```
        

In [11]:
#Function goes here
def conv_time(date):
    
    #Work on a copy so the dataframe passed in is left unchanged
    date = date.copy()
    
    year_rounded = np.round(date['Date'],3)

    year = np.floor(year_rounded)

    #Subtract from the 'Date' column, not from the whole dataframe
    month = np.ceil(12*(date['Date']-year)+1)
    
    #Create your year and month column
    date['Year'] = year
    date['Month'] = month
    
    return date

In [13]:
conv_time(date_df.head())

Unnamed: 0,Date,Data,Year,Month
0,1880.0,0.50649,1880.0,1.0
1,1880.083333,0.570115,1880.0,2.0
2,1880.166667,0.660363,1880.0,3.0
3,1880.25,0.020612,1880.0,4.0
4,1880.333333,0.847077,1880.0,5.0


In [None]:
#Test on date_df

In [21]:
def conv_test(date):
    
    year_rounded = np.round(date['Date'],3)

    year = np.floor(year_rounded)

    #Subtract from the 'Date' column, not from the whole dataframe
    month = np.ceil(12*(date['Date']-year)+1)
    
    #pd.DataFrame is called with a dict in parentheses; using square brackets
    #raises "TypeError: 'type' object is not subscriptable"
    df = pd.DataFrame({"Year": year, "Month": month})
    
    return df

In [22]:
conv_test(date_df)
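
The task statement asks for a function that takes a date column rather than a whole dataframe. The sketch below is one possible reading of that, not a definitive solution: `split_decimal_year` and its `decimals` parameter are hypothetical names, and it assumes each date marks the start of a month, following the convention in Step 2 where 1880.5 maps to month 7. Note that the new dataframe is built by calling `pd.DataFrame` with a dictionary in parentheses.

```python
import numpy as np
import pandas as pd

def split_decimal_year(dates, decimals=3):
    """Return a new dataframe with integer 'Year' and 'Month' columns.

    Assumes each value marks the start of a month (1880.5 -> month 7).
    """
    dates = pd.Series(dates, dtype=float)

    # Round first so values just below a whole year are not floored a year early
    year = np.floor(np.round(dates, decimals))

    # Months elapsed in the year, snapped to a whole number, then shifted to 1-12
    month = np.round(12 * (dates - year)) + 1

    return pd.DataFrame({'Year': year.astype(int), 'Month': month.astype(int)})

# Usage on the same test dates as in Step 2
date_array = np.arange(1880, 1882.05, 1/12)
result = split_decimal_year(date_array)
print(result.head())
```

Because the function accepts a column, it can also be called as `split_decimal_year(date_df['Date'])` without touching the other columns of `date_df`.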