# Convert Unix Time into Human-Readable and Timezone-Adjusted Time

This code is useful if your data contains timestamps and you want to adjust them to specific timezones, extract individual components (e.g. year, day, or hour), or convert Unix time into a more readable format.

The timestamps in our datasets were in Unix format, which represents a time as the number of seconds that have passed since January 1st, 1970 (UTC). This format is not very helpful for the visualizations we wanted to make, so we first converted the Unix time into human-readable time and then adjusted it by timezone. We used this code to create our stacked bar chart of the relative frequency of posts by hour of day for each subreddit cluster: running the for loop below on our combined Reddit post dataset gave us the timezone-adjusted hour of day for each post.

For the purposes of this data manipulation demo, and to protect data privacy, we took a subset of 100 observations from the long/lat geographic dataset and replaced the 'author' column with fake usernames. This subset is what we use in this demo and in our visualizations.

## Import Libraries

Here we import the libraries needed to run the conversion code.

In [1]:
import os
import numpy as np
import pandas as pd
import pytz
from datetime import datetime, timedelta, date 

## Read in Data

Read in the data that you would like to perform the following data manipulations on.

The `os.getcwd()` function returns the current working directory, which should be the `data_wrangling` folder. To access the data file, we replace `data_wrangling` in that path with `synthetic_data`, the folder that contains the file. Once that is done, we can go ahead and read in the data and perform the necessary manipulations.

In [5]:
DATA_DIR = os.getcwd()
DATA_DIR = DATA_DIR.replace('data_wrangling', 'synthetic_data')

location_data = pd.read_parquet(DATA_DIR + '/geo_known_synthetic.parquet')
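The string replacement above only works when the notebook is run from a folder named `data_wrangling`, and it would also rewrite that substring if it appeared anywhere else in the path. A minimal sketch of an alternative using `pathlib`, assuming `synthetic_data` sits next to `data_wrangling` (the helper name `sibling_data_dir` is hypothetical, not part of the original notebook):

```python
from pathlib import Path

def sibling_data_dir(cwd: Path, sibling: str = "synthetic_data") -> Path:
    """Return the folder named `sibling` next to `cwd` (assumed project layout)."""
    return cwd.parent / sibling

# Hypothetical usage when the working directory is the data_wrangling folder.
data_dir = sibling_data_dir(Path.cwd())
data_file = data_dir / "geo_known_synthetic.parquet"
# location_data = pd.read_parquet(data_file)  # same call as in the cell above
```

Building the path from its components avoids depending on the exact text of the working directory.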

## Convert Unix Time to Human-Readable Time

We use the `pd.to_datetime()` function with `unit='s'` to convert the Unix time into human-readable time. The resulting times are in UTC.

In [6]:
location_data['converted_time'] = pd.to_datetime(location_data['created_utc'], unit='s')
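As a quick check of what this conversion produces, the sketch below applies it to a single Unix timestamp taken from the first row of the demo output. It also shows the optional `utc=True` argument, which is not used in the original notebook but makes the UTC timezone explicit on the result.

```python
import pandas as pd

# One Unix timestamp (in seconds) from the first row of the demo data.
created_utc = pd.Series([1520125956])

naive = pd.to_datetime(created_utc, unit="s")            # no timezone attached
aware = pd.to_datetime(created_utc, unit="s", utc=True)  # explicitly marked as UTC

print(naive.iloc[0])  # 2018-03-04 01:12:36
print(aware.iloc[0])  # 2018-03-04 01:12:36+00:00
```

Both values describe the same instant. The timezone-aware version is the one that can later be converted to a local timezone with `tz_convert`.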

## Adjust Time by Timezone + Extract Time Components

The long/lat geographical dataset contains timezone information in the "Region/City" format (for example, `America/Los_Angeles`). We used the following for loop to go through the dataset and convert each Unix time into human-readable time adjusted to the given timezone. The loop also extracts parts of the adjusted time into separate columns: date, month, day, hour, and the abbreviated timezone.

In [7]:
for index, row in location_data.iterrows():
    # Look up the timezone named in this row, e.g. 'America/Los_Angeles'
    tz = pytz.timezone(row['timezone'])
    # Interpret the Unix timestamp as a local datetime in that timezone
    ldt = datetime.fromtimestamp(row['created_utc'], tz)
    # Full local timestamp with timezone abbreviation and UTC offset
    location_data.loc[index, "time_adj"] = ldt.strftime('%Y-%m-%d %H:%M:%S %Z%z').upper()
    # Individual components, stored as zero-padded strings
    location_data.loc[index, "date"] = ldt.strftime('%Y-%m-%d').upper()
    location_data.loc[index, "month"] = ldt.strftime('%m').upper()
    location_data.loc[index, "day"] = ldt.strftime('%d').upper()
    location_data.loc[index, "hour"] = ldt.strftime('%H').upper()
    location_data.loc[index, "tz_abbrev"] = ldt.strftime('%Z').upper()
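Row-by-row iteration with `iterrows()` is fine for 100 rows but can become slow on a larger dataset. One possible alternative, sketched here and not part of the original notebook, converts all rows that share a timezone in a single step using pandas' `tz_convert`. The function name `add_local_time_columns` is hypothetical, and the sketch assumes the DataFrame has a unique index and does not already contain the output columns.

```python
import pandas as pd

def add_local_time_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with local-time columns, one timezone group at a time."""
    out = df.copy()
    # Timezone-aware UTC timestamps built from the Unix seconds.
    utc = pd.to_datetime(out["created_utc"], unit="s", utc=True)
    parts = []
    for tz_name, group in utc.groupby(out["timezone"]):
        local = group.dt.tz_convert(tz_name)  # shift the whole group at once
        parts.append(pd.DataFrame({
            "time_adj": local.dt.strftime("%Y-%m-%d %H:%M:%S %Z%z"),
            "date": local.dt.strftime("%Y-%m-%d"),
            "month": local.dt.strftime("%m"),
            "day": local.dt.strftime("%d"),
            "hour": local.dt.strftime("%H"),
            "tz_abbrev": local.dt.strftime("%Z"),
        }))
    # Rows are matched back to the original frame by index.
    return out.join(pd.concat(parts))

# Two timestamps and timezones taken from the demo output.
demo = pd.DataFrame({
    "created_utc": [1520125956, 1542046721],
    "timezone": ["America/Los_Angeles", "America/New_York"],
})
result = add_local_time_columns(demo)
print(result[["time_adj", "hour", "tz_abbrev"]])
```

For these two rows the sketch yields the same `time_adj`, `hour`, and `tz_abbrev` values as the loop above.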

In [8]:
location_data

Unnamed: 0,author_synthetic,created_utc,long,lat,timezone,converted_time,time_adj,date,month,day,hour,tz_abbrev
0,user_34450,1520125956,-117.937995,33.774269,America/Los_Angeles,2018-03-04 01:12:36,2018-03-03 17:12:36 PST-0800,2018-03-03,03,03,17,PST
1,user_36853,1542046721,-72.521501,41.775930,America/New_York,2018-11-12 18:18:41,2018-11-12 13:18:41 EST-0500,2018-11-12,11,12,13,EST
2,user_7400,1348026594,-94.578567,39.099727,America/Chicago,2012-09-19 03:49:54,2012-09-18 22:49:54 CDT-0500,2012-09-18,09,18,22,CDT
3,user_11328,1348030322,-85.954041,42.972193,America/Detroit,2012-09-19 04:52:02,2012-09-19 00:52:02 EDT-0400,2012-09-19,09,19,00,EDT
4,user_14264,1350704655,-82.998794,39.961176,America/New_York,2012-10-20 03:44:15,2012-10-19 23:44:15 EDT-0400,2012-10-19,10,19,23,EDT
...,...,...,...,...,...,...,...,...,...,...,...,...
95,user_9319,1348028043,-97.941394,29.883275,America/Chicago,2012-09-19 04:14:03,2012-09-18 23:14:03 CDT-0500,2012-09-18,09,18,23,CDT
96,user_11424,1348030486,-97.585875,40.166393,America/Chicago,2012-09-19 04:54:46,2012-09-18 23:54:46 CDT-0500,2012-09-18,09,18,23,CDT
97,user_22720,1425577949,-80.357827,25.666034,America/New_York,2015-03-05 17:52:29,2015-03-05 12:52:29 EST-0500,2015-03-05,03,05,12,EST
98,user_5268,1332899918,-97.743061,30.267153,America/Chicago,2012-03-28 01:58:38,2012-03-27 20:58:38 CDT-0500,2012-03-27,03,27,20,CDT
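
The `hour` column is what feeds the stacked bar chart described in the introduction. A minimal sketch of computing relative frequencies of posts by hour within each group is given here, assuming a hypothetical `cluster` column that is not present in this demo dataset. The rows are made up purely for illustration.

```python
import pandas as pd

# Made-up posts: a hypothetical "cluster" label plus "hour" strings
# in the same zero-padded format the loop above produces.
posts = pd.DataFrame({
    "cluster": ["a", "a", "a", "b", "b"],
    "hour": ["17", "17", "23", "13", "22"],
})

# Rows are clusters, columns are hours; normalize="index" makes each row sum to 1.
hour_share = pd.crosstab(posts["cluster"], posts["hour"], normalize="index")
print(hour_share)

# hour_share.plot(kind="bar", stacked=True)  # stacked bar chart; requires matplotlib
```

Normalizing within each cluster keeps clusters with very different post counts comparable on the same chart.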
