# Python: Date manipulation

**Goal**: Explore the time and date management modules in Python

## The time module

The **time** module works with timestamps in UNIX format, i.e. a number (a float in Python) that represents the number of seconds elapsed since the epoch, January 1, 1970 at 00:00:00 UTC.

In [1]:
# Example
import time

In [2]:
current_time = time.time()
current_time

1641505267.926285

To convert this timestamp into a more readable format, we use the **gmtime()** function of the time module, which returns a struct_time object expressed in UTC.

In [3]:
current_struct_time = time.gmtime(current_time)
current_struct_time

time.struct_time(tm_year=2022, tm_mon=1, tm_mday=6, tm_hour=21, tm_min=41, tm_sec=7, tm_wday=3, tm_yday=6, tm_isdst=0)

In [4]:
current_hour = current_struct_time.tm_hour
current_hour

21

In [5]:
current_year = current_struct_time.tm_year
current_year

2022
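
As a minimal sketch of how a timestamp relates to the epoch, the snippet below converts two fixed timestamps with gmtime(), formats one of them with time.strftime(), and goes back to seconds with calendar.timegm(). The value 1641505267 is the integer part of the timestamp shown above; the variable names are illustrative only.

```python
import calendar
import time

# Timestamp 0 is the epoch itself: 1 January 1970, 00:00:00 UTC
epoch = time.gmtime(0)
print(epoch.tm_year, epoch.tm_mon, epoch.tm_mday)  # 1970 1 1

# Integer part of the timestamp shown above
sample_struct = time.gmtime(1641505267)
sample_text = time.strftime("%Y-%m-%d %H:%M:%S", sample_struct)
print(sample_text)  # 2022-01-06 21:41:07

# calendar.timegm() is the inverse of gmtime(): struct_time (UTC) -> seconds
round_trip = calendar.timegm(sample_struct)
print(round_trip)  # 1641505267
```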

## The datetime module

The **datetime** module offers a richer representation of dates and times and also allows you to perform arithmetic on them.

In [6]:
# Example
import datetime

In [7]:
current_datetime = datetime.datetime.now()
current_datetime

datetime.datetime(2022, 1, 6, 22, 41, 8, 15035)

In [8]:
print(current_datetime)

2022-01-06 22:41:08.015035


In [9]:
current_year = current_datetime.year
current_year

2022

In [10]:
current_month = current_datetime.month
current_month

1

In [11]:
current_day = current_datetime.day
current_day

6
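
The object returned by now() carries no time zone information (it is "naive") and reflects the local clock, which is one plausible reason the hour above differs from the gmtime() output. The sketch below builds a datetime by hand from the values printed above and contrasts it with a timezone-aware value; the names sample and aware_now are illustrative.

```python
import datetime

# Build a datetime by hand (values copied from the output above)
sample = datetime.datetime(2022, 1, 6, 22, 41, 8, 15035)
print(sample.hour, sample.minute, sample.microsecond)  # 22 41 15035
print(sample.isoformat())  # 2022-01-06T22:41:08.015035
print(sample.weekday())    # 3 (Monday is 0, so 3 is Thursday)

# A timezone-aware "now" in UTC; tzinfo is None for naive datetimes
aware_now = datetime.datetime.now(datetime.timezone.utc)
print(sample.tzinfo is None, aware_now.tzinfo is None)  # True False
```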

## Timedelta class

The **timedelta** class represents a duration. Adding it to, or subtracting it from, an object of type datetime gives a new datetime.

In [12]:
# Example 1
today = datetime.datetime.now()
print(today)

2022-01-06 22:41:08.093825


In [13]:
diff = datetime.timedelta(weeks=1, days=5)
print(diff)

12 days, 0:00:00


In [14]:
result = today + diff
print(result)

2022-01-18 22:41:08.093825


In [15]:
# Example 2
diff = datetime.timedelta(days=1)
print(diff)

1 day, 0:00:00


In [16]:
tomorrow = today + diff
print(tomorrow)

2022-01-07 22:41:08.093825


In [17]:
yesterday = today - diff
print(yesterday)

2022-01-05 22:41:08.093825
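
The operation also works in the other direction: subtracting one datetime from another produces a timedelta. The short sketch below uses two fixed dates taken from the examples above (microseconds dropped for readability), so the values are reproducible.

```python
import datetime

start = datetime.datetime(2022, 1, 6, 22, 41, 8)
end = datetime.datetime(2022, 1, 18, 22, 41, 8)

# Subtracting two datetimes gives a timedelta
gap = end - start
print(gap)                  # 12 days, 0:00:00
print(gap.days)             # 12
print(gap.total_seconds())  # 1036800.0

# timedelta normalizes its arguments: 1 week + 5 days == 12 days
print(gap == datetime.timedelta(weeks=1, days=5))  # True
```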


## Date formatting

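The **strftime()** method of the datetime class formats a date as a string according to the format codes you supply (for example, %d for the day of the month, %b for the abbreviated month name, and %Y for the four-digit year).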

In [18]:
# Example
print(today)

2022-01-06 22:41:08.093825


In [19]:
print(type(today))

<class 'datetime.datetime'>


In [20]:
string_today = today.strftime("%d %b %Y")
print(string_today)

06 Jan 2022


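It is also possible to do the reverse conversion, i.e. from a string to a time value. The **strptime()** function of the time module, used here, returns a struct_time; the datetime class has a method of the same name, datetime.datetime.strptime(), which returns a datetime object instead.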

In [21]:
# Use a new name so the datetime object stored in today is not overwritten
parsed_today = time.strptime(string_today, "%d %b %Y")
parsed_today

time.struct_time(tm_year=2022, tm_mon=1, tm_mday=6, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=3, tm_yday=6, tm_isdst=-1)
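
When a datetime object is preferred over a struct_time, datetime.datetime.strptime() accepts the same format codes. The following is a small sketch of the round trip; it assumes an English (or default C) locale, since %b matches month abbreviations in the current locale.

```python
import datetime

text = "06 Jan 2022"

# %b matches the abbreviated month name in the current locale
parsed = datetime.datetime.strptime(text, "%d %b %Y")
print(parsed)  # 2022-01-06 00:00:00

# Formatting with the same pattern gives back the original string
print(parsed.strftime("%d %b %Y") == text)  # True

# Keep only the date part when the time of day is irrelevant
print(parsed.date())  # 2022-01-06
```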

## Applications on our dataset

In [22]:
import csv

In [23]:
# Read the whole file, then drop the header row.
# The with block closes the file automatically.
with open("askreddit_2015.csv", encoding='utf-8', newline='') as f:
    posts = list(csv.reader(f))
posts = posts[1:]

In [24]:
posts[0:5]

[['What\'s your internet "white whale", something you\'ve been searching for years to find with no luck?',
  '11510',
  '1433213314.0',
  '1',
  '26195'],
 ["What's your favorite video that is 10 seconds or less?",
  '8656',
  '1434205517.0',
  '4',
  '8479'],
 ['What are some interesting tests you can take to find out about yourself?',
  '8480',
  '1443409636.0',
  '1',
  '4055'],
 ["PhD's of Reddit. What is a dumbed down summary of your thesis?",
  '7927',
  '1440188623.0',
  '0',
  '13201'],
 ['What is cool to be good at, yet uncool to be REALLY good at?',
  '7711',
  '1440082910.0',
  '0',
  '20325']]

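Let's convert the dates of our dataset (in UNIX format) into a more understandable format (datetime) by using the **fromtimestamp()** method of the datetime class. Without a tz argument, the result is expressed in the local time zone of the machine, so the hours shown in the outputs of this section can differ from one machine to another.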

In [25]:
# Example
datetime_object = datetime.datetime.fromtimestamp(1440082910.0)
print(datetime_object)

2015-08-20 17:01:50
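
To get a result that does not depend on the machine, fromtimestamp() accepts an optional tz argument. The sketch below converts the same timestamp to UTC; the two-hour gap with the output above is consistent with a notebook run in a UTC+2 zone, which is an assumption and not something the dataset states.

```python
import datetime

timestamp = 1440082910.0

# Passing tz makes the result independent of the machine's time zone
utc_datetime = datetime.datetime.fromtimestamp(timestamp, tz=datetime.timezone.utc)
print(utc_datetime)  # 2015-08-20 15:01:50+00:00

# timestamp() converts an aware datetime back to UNIX seconds
print(utc_datetime.timestamp() == timestamp)  # True
```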


### Training

In [26]:
for row in posts:
    # Column 2 holds the UNIX timestamp as a string, e.g. '1433213314.0'
    time_stamp = float(row[2])
    row[2] = datetime.datetime.fromtimestamp(time_stamp)

In [27]:
print(posts[0:5])

[['What\'s your internet "white whale", something you\'ve been searching for years to find with no luck?', '11510', datetime.datetime(2015, 6, 2, 4, 48, 34), '1', '26195'], ["What's your favorite video that is 10 seconds or less?", '8656', datetime.datetime(2015, 6, 13, 16, 25, 17), '4', '8479'], ['What are some interesting tests you can take to find out about yourself?', '8480', datetime.datetime(2015, 9, 28, 5, 7, 16), '1', '4055'], ["PhD's of Reddit. What is a dumbed down summary of your thesis?", '7927', datetime.datetime(2015, 8, 21, 22, 23, 43), '0', '13201'], ['What is cool to be good at, yet uncool to be REALLY good at?', '7711', datetime.datetime(2015, 8, 20, 17, 1, 50), '0', '20325']]


In [28]:
print(posts[0][2])

2015-06-02 04:48:34
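
The loop above modifies the rows in place, so running the cell a second time would fail on float() because column 2 already holds a datetime. One possible alternative, sketched below, builds converted copies instead. The helper name convert_row and the choice of UTC are assumptions made for this illustration; the two sample rows are copied from the dataset preview above.

```python
import datetime

# Two rows copied from the dataset preview above
sample_rows = [
    ["What's your favorite video that is 10 seconds or less?", "8656", "1434205517.0", "4", "8479"],
    ["What is cool to be good at, yet uncool to be REALLY good at?", "7711", "1440082910.0", "0", "20325"],
]

def convert_row(row):
    """Return a copy of row with column 2 parsed as a UTC datetime."""
    converted = list(row)
    converted[2] = datetime.datetime.fromtimestamp(float(row[2]), tz=datetime.timezone.utc)
    return converted

converted_rows = [convert_row(row) for row in sample_rows]
print(converted_rows[0][2])  # 2015-06-13 14:25:17+00:00
print(sample_rows[0][2])     # 1434205517.0 (the input is unchanged)
```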


## Counting posts published in May

### Training

In [29]:
may_count = 0

for row in posts:
    if row[2].month == 5:
        may_count += 1

In [30]:
print(may_count)

75


## Counting posts published in any month

### Training

In [31]:
def count_post_in_month(month):
    """Return the number of posts published in the given month (1-12)."""
    count = 0

    for row in posts:
        if row[2].month == month:
            count += 1

    return count

In [32]:
months = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

for month in months:
    print(count_post_in_month(month))

52
46
59
58
75
74
77
94
69
78
96
89
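
Calling the function once per month scans the whole list twelve times. A possible single-pass variant uses collections.Counter, sketched here on the five timestamps from the dataset preview. UTC is used so the result is reproducible, which is a choice made for this illustration and differs from the local-time conversion used above.

```python
import datetime
from collections import Counter

# Timestamps of the five posts shown in the dataset preview
sample_timestamps = [1433213314.0, 1434205517.0, 1443409636.0, 1440188623.0, 1440082910.0]

# Count months in one pass over the data
month_counts = Counter(
    datetime.datetime.fromtimestamp(ts, tz=datetime.timezone.utc).month
    for ts in sample_timestamps
)

for month in range(1, 13):
    # A Counter returns 0 for months that never appear
    print(month, month_counts[month])
```

With only five rows, the counts are 2 for June, 2 for August, 1 for September, and 0 elsewhere.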
