<img src="https://user-images.strikinglycdn.com/res/hrscywv4p/image/upload/c_limit,fl_lossy,h_300,w_300,f_auto,q_auto/1266110/Logo_wzxi0f.png" style="float: left; margin: 20px; height: 55px">

# Day 7 - Exercises Solutions

## Categorical Data

### What is Categorical Data

A categorical field takes its value from a limited, fixed set of possible values. Some examples of fields and values are:

Field | Potential Values
--- | ----
Blood type | O negative, O positive, A negative, B negative
Customer responses on satisfaction of a product | happy, content, sad
Eye color | green, blue, brown

There are two common types of categorical data: nominal and ordinal.

Nominal categorical data has values with no inherent order such as the eye color example above. 

Ordinal categorical data contains values with an intended order. One example is the customer responses above. There's an inherent order to the values: happy is a more positive measurement than content. In my list of potential values, I ordered the responses from those that rate the product most likeable to least likeable.

### Categorical Data in Pandas

In pandas, a categorical column looks much like a column of plain strings or numbers. With ordered categorical data, however, there are a few small differences that affect my typical workflow: sorting, and calculating the minimum and maximum values in a column.
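
To make those differences concrete, here is a minimal sketch that builds the same values twice, once as plain strings and once as an ordered categorical. The three-item `sample` list and the variable names are made up for this comparison only.

```python
import pandas as pd

# Hypothetical three-response sample, used only for this comparison.
sample = ["sad", "happy", "content"]

as_text = pd.Series(sample)
as_ordered = pd.Series(
    pd.Categorical(sample, categories=["happy", "content", "sad"], ordered=True)
)

# Plain strings sort and compare alphabetically.
text_sorted = as_text.sort_values().tolist()
text_min = as_text.min()

# Ordered categoricals sort and compare by the declared category order.
ordered_sorted = as_ordered.sort_values().tolist()
ordered_min = as_ordered.min()

print(text_sorted, text_min)
print(ordered_sorted, ordered_min)
```

The string version puts `content` first because of the alphabet, while the categorical version puts `happy` first because of its rank.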

### Import Modules

In [0]:
import pandas as pd

### Create Survey Responses Data

Create a Python list of survey responses that are either `happy`, `content`, or `sad`.

In [0]:
responses = ["happy", "happy", "content", "content", "content", "content", "happy", "content", "sad", "sad", "sad", "sad", "sad", "sad"]

Create a pandas categorical data structure from these responses. Setting the `ordered` argument to `True` tells pandas that the order of the `categories` argument is meaningful; here the responses are ranked `happy`, then `content`, then `sad`.

In [0]:
survey_responses = pd.Categorical(responses, categories=["happy", "content", "sad"], ordered=True)

View the data type of `survey_responses`.

In [0]:
type(survey_responses)

pandas.core.categorical.Categorical
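
A categorical stores each value as an integer code that points into its list of categories. The sketch below, which uses a shortened hypothetical sample rather than the full survey list, shows how to inspect that structure and how the declared order enables comparisons.

```python
import pandas as pd

# Shortened sample (an assumption for illustration, not the survey data).
sample = ["happy", "content", "sad", "content"]
cat = pd.Categorical(sample, categories=["happy", "content", "sad"], ordered=True)

category_order = cat.categories.tolist()  # the declared ranking
codes = cat.codes.tolist()                # position of each value in that ranking
ranked_before_sad = (cat < "sad").tolist()  # comparisons follow the ranking

print(category_order)
print(codes)
print(cat.ordered)
print(ranked_before_sad)
```

Comparisons such as `cat < "sad"` are only allowed because `ordered=True`; on an unordered categorical pandas raises a `TypeError` instead.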

Create a pandas DataFrame with one column called `response` with the `survey_responses` data structure.

In [0]:
df_survey_responses = pd.DataFrame({"response": survey_responses})

### Analyze Survey Responses Data

Preview the first 5 rows of `df_survey_responses`.

In [0]:
df_survey_responses.head()

index | response
--- | ---
0 | happy
1 | happy
2 | content
3 | content
4 | content


#### Descriptive Statistics

Use the `describe()` method on a pandas DataFrame to get summary statistics for its columns; you can also call this method directly on a Series. We'll call it on the DataFrame below.

- `count` shows the number of responses
- `unique` shows the number of unique categorical values
- `top` shows the most frequently occurring categorical value
- `freq` shows the frequency/count of the most frequently occurring categorical value

In [0]:
df_survey_responses.describe()

statistic | response
--- | ---
count | 14
unique | 3
top | sad
freq | 6


#### Sorting

Sort the responses in the `response` column in ascending order and you'll see they appear with `happy` at the top and `sad` at the bottom, following the declared category order rather than the alphabet.

In [0]:
df_survey_responses.sort_values(by='response').head(10)

index | response
--- | ---
0 | happy
1 | happy
6 | happy
2 | content
3 | content
4 | content
5 | content
7 | content
8 | sad
9 | sad


#### Count of unique occurrences of survey responses

Call the `value_counts()` method on the `response` column to get a count of occurrences for each of the categorical responses. Notice how `sad` was mentioned the most and `happy` the least.

In [0]:
df_survey_responses['response'].value_counts()

sad        6
content    5
happy      3
Name: response, dtype: int64
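
By default `value_counts()` orders its result by frequency. A minimal sketch of two common variations follows: `sort_index()` to list the counts in category rank instead, and `normalize=True` to get shares rather than counts. The data is rebuilt here with the same counts as the survey so the block runs on its own.

```python
import pandas as pd

# Same counts as the survey data: 3 happy, 5 content, 6 sad.
responses = ["happy"] * 3 + ["content"] * 5 + ["sad"] * 6
response = pd.Series(
    pd.Categorical(responses, categories=["happy", "content", "sad"], ordered=True),
    name="response",
)

# sort_index() reorders the counts by category rank rather than by frequency.
by_rank = response.value_counts().sort_index()

# normalize=True returns each category's share of all responses.
shares = response.value_counts(normalize=True).sort_index()

print(by_rank)
print(shares.round(3))
```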

#### Calculate the Lowest-Ranked Value in the `response` Column

The result of a pandas Series `min()` method may be different than what you expect. On an ordered categorical, `min()` returns the value whose category comes first in the declared order, not the least-occurring value. We're returned `happy` because it is the first entry in `categories`. It also happens to be the least-occurring response here (only 3 of the 14 responses), but that is a coincidence of this data.

In [0]:
df_survey_responses['response'].min()

'happy'

#### Calculate the Highest-Ranked Value in the `response` Column

Call the `max()` method on the `response` column and we're returned `sad`, the last entry in the declared category order. It is also the most-occurring value in this data, but `max()` does not look at counts.

In [0]:
df_survey_responses['response'].max()

'sad'
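
On an ordered categorical, `min()` and `max()` follow the declared category order, which need not match how often each value occurs. The sketch below keeps the two ideas apart using a hypothetical `ratings` Series (not the survey data) in which the first-ranked category is also the most frequent one. Frequency questions are answered from `value_counts()` instead.

```python
import pandas as pd

# Hypothetical data: "happy" is ranked first AND occurs most often.
ratings = pd.Series(
    pd.Categorical(
        ["happy", "happy", "happy", "content", "sad", "sad"],
        categories=["happy", "content", "sad"],
        ordered=True,
    )
)

lowest_ranked = ratings.min()    # first category in the declared order
highest_ranked = ratings.max()   # last category in the declared order

counts = ratings.value_counts()
most_frequent = counts.idxmax()   # label with the largest count
least_frequent = counts.idxmin()  # label with the smallest count

print(lowest_ranked, highest_ranked)
print(most_frequent, least_frequent)
```

Here `min()` and the most frequent value are both `happy`, which would be impossible if `min()` measured frequency.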

You can learn more about the differences in working with categorical data in pandas from the <a href='https://pandas.pydata.org/pandas-docs/stable/categorical.html'>official documentation page</a>.

## Datetime Review

In [0]:
import time
import datetime

Write a Python script that displays the current date and time in the following formats.

- Current date and time
- Current year
- Month of year
- Week number of the year
- Weekday of the week
- Day of year
- Day of the month
- Day of week

In [0]:
print("Current date and time: " , datetime.datetime.now())
print("Current year: ", datetime.date.today().strftime("%Y"))
print("Month of year: ", datetime.date.today().strftime("%B"))
print("Week number of the year: ", datetime.date.today().strftime("%W"))
print("Weekday of the week: ", datetime.date.today().strftime("%w"))
print("Day of year: ", datetime.date.today().strftime("%j"))
print("Day of the month : ", datetime.date.today().strftime("%d"))
print("Day of week: ", datetime.date.today().strftime("%A"))

Current date and time:  2019-09-16 14:56:17.883103
Current year:  2019
Month of year:  September
Week number of the year:  37
Weekday of the week:  1
Day of year:  259
Day of the month :  16
Day of week:  Monday
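
Because the output above changes every time the cell runs, a fixed date makes the format codes easier to check. This sketch formats 16 September 2019, the date shown in the output above. The `parts` dictionary and its keys are illustrative names, and the locale-dependent codes `%B` and `%A` are left out on purpose.

```python
import datetime

fixed = datetime.date(2019, 9, 16)  # the date shown in the output above

parts = {
    "year": fixed.strftime("%Y"),
    # %W: week of the year with Monday as the first day of the week;
    # days before the first Monday belong to week 00.
    "week_of_year": fixed.strftime("%W"),
    # %w: weekday as a number, 0 = Sunday through 6 = Saturday.
    "weekday_number": fixed.strftime("%w"),
    "day_of_year": fixed.strftime("%j"),
    "day_of_month": fixed.strftime("%d"),
}

for name, value in parts.items():
    print(f"{name}: {value}")
```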


Write a Python program to determine whether a given year is a leap year.

In [0]:
def leap_year(y):
    if y % 400 == 0:
        return True
    if y % 100 == 0:
        return False
    if y % 4 == 0:
        return True
    else:
        return False

In [0]:
print(leap_year(1900))
print(leap_year(2004))

False
True
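
As a cross-check, the standard library already provides `calendar.isleap`. The sketch below restates the rule as a single boolean expression and compares it with `calendar.isleap` for a few years; the list of years is an arbitrary choice.

```python
import calendar


def leap_year(y):
    # Divisible by 4, except century years, unless divisible by 400.
    return y % 4 == 0 and (y % 100 != 0 or y % 400 == 0)


years = [1900, 2000, 2004, 2019]
results = {year: leap_year(year) for year in years}
agrees_with_stdlib = all(leap_year(y) == calendar.isleap(y) for y in years)

print(results)
print(agrees_with_stdlib)
```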


Write a Python program to convert a string to datetime.

In [0]:
date_object = datetime.datetime.strptime('Jul 1 2014 2:43PM', '%b %d %Y %I:%M%p')
print(date_object)

2014-07-01 14:43:00
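
`strptime` raises a `ValueError` when the text does not match the format string. A minimal sketch of handling that case follows; `parse_or_none` is a hypothetical helper name, not part of the standard library.

```python
from datetime import datetime


def parse_or_none(text, fmt):
    """Return the parsed datetime, or None when `text` does not match `fmt`."""
    try:
        return datetime.strptime(text, fmt)
    except ValueError:
        return None


fmt = "%b %d %Y %I:%M%p"
good = parse_or_none("Jul 1 2014 2:43PM", fmt)
bad = parse_or_none("2014-07-01 14:43", fmt)  # ISO-style text, wrong format

print(good)
print(bad)
```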


Write a Python program to subtract five days from the current date.

In [0]:
# Read the current date once so both printed lines refer to the same day
today = datetime.date.today()
dt = today - datetime.timedelta(days=5)
print('Current Date :', today)
print('5 days before Current Date :', dt)


Current Date : 2019-09-16
5 days before Current Date : 2019-09-11


Write a Python program to convert a Unix timestamp string to a readable date.

In [0]:
print(
    datetime.datetime.fromtimestamp(
        int("1284105682")
    ).strftime('%Y-%m-%d %H:%M:%S')
)

2010-09-10 09:01:22
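
`fromtimestamp` without a timezone converts to the local time of the machine running the code, so the printed hour can differ between systems. A sketch of a timezone-independent variant passes `tz=datetime.timezone.utc` explicitly:

```python
import datetime

timestamp = int("1284105682")

# An explicit UTC timezone gives the same result on every machine.
utc_dt = datetime.datetime.fromtimestamp(timestamp, tz=datetime.timezone.utc)
formatted = utc_dt.strftime("%Y-%m-%d %H:%M:%S %Z")

print(formatted)
```

The UTC result is one hour behind the output shown above, which is consistent with that cell having been run on a machine set to British Summer Time.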


Write a Python program to print the next five days, starting from today.

In [0]:
base = datetime.datetime.today()
for x in range(5):
    print(base + datetime.timedelta(days=x))

2019-09-16 15:07:57.726573
2019-09-17 15:07:57.726573
2019-09-18 15:07:57.726573
2019-09-19 15:07:57.726573
2019-09-20 15:07:57.726573
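
The loop above counts today as the first of the five days and prints the time of day as well. If only calendar dates are wanted, `datetime.date` can be used instead. The sketch below uses a fixed start date (an assumption, so the result is reproducible) and shows both readings of "next five days".

```python
import datetime

start = datetime.date(2019, 9, 16)  # fixed start date for a reproducible result

# Five days including the start date (offsets 0..4), as in the loop above.
including_start = [start + datetime.timedelta(days=n) for n in range(5)]

# Five days strictly after the start date (offsets 1..5).
after_start = [start + datetime.timedelta(days=n) for n in range(1, 6)]

print([d.isoformat() for d in including_start])
print([d.isoformat() for d in after_start])
```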


Write a Python program to convert Year/Month/Day to Day of Year.

In [0]:
today = datetime.datetime.now()
day_of_year = (today - datetime.datetime(today.year, 1, 1)).days + 1
print(day_of_year)

259
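
The subtraction above is one of several equivalent routes to the day of the year. As a sketch, the same number can be read from `timetuple().tm_yday` or from the `%j` format code; a fixed date is used here so the three results can be compared.

```python
import datetime

d = datetime.date(2019, 9, 16)

by_subtraction = (d - datetime.date(d.year, 1, 1)).days + 1
by_timetuple = d.timetuple().tm_yday
by_strftime = int(d.strftime("%j"))

print(by_subtraction, by_timetuple, by_strftime)
```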


Write a Python program to get the ISO week number of a date.

In [0]:
print(datetime.date(2015, 6, 16).isocalendar()[1])

25
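
Note that `isocalendar()` uses ISO 8601 week numbering, which is not the same as the `%W` format code used earlier in this section. In ISO numbering, week 1 is the week containing the year's first Thursday, whereas `%W` starts week 1 at the first Monday. A short sketch shows the two disagreeing for the same date:

```python
import datetime

d = datetime.date(2015, 6, 16)

iso_week = d.isocalendar()[1]              # ISO 8601 week number
monday_based_week = int(d.strftime("%W"))  # weeks counted from the first Monday

print(iso_week, monday_based_week)
```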


Write a Python program to find the date of the Monday of a given week.

In [0]:
print(time.asctime(time.strptime('2015 50 1', '%Y %W %w')))

Mon Dec 14 00:00:00 2015
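
The answer depends on which week-numbering scheme "week 50" refers to. The solution above uses `%W` (weeks counted from the year's first Monday). For ISO 8601 week numbers, `datetime.date.fromisocalendar` (available from Python 3.8) is a possible alternative, sketched here side by side:

```python
import datetime

# Week 50 in %W numbering, weekday 1 (Monday in %w numbering).
from_w = datetime.datetime.strptime("2015 50 1", "%Y %W %w").date()

# ISO week 50, ISO weekday 1 (Monday).
from_iso = datetime.date.fromisocalendar(2015, 50, 1)

print(from_w)
print(from_iso)
```

For 2015 the two schemes are one week apart, so it is worth confirming which numbering the input data uses.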


Write a Python program to select all the Sundays of a specified year.

In [0]:
def all_sundays(year):
    # January 1st of the given year
    dt = datetime.date(year, 1, 1)
    # First Sunday of the given year
    dt += datetime.timedelta(days=6 - dt.weekday())
    while dt.year == year:
        yield dt
        dt += datetime.timedelta(days=7)

In [0]:
for s in all_sundays(2020):
    print(s)

2020-01-05
2020-01-12
2020-01-19
2020-01-26
2020-02-02
2020-02-09
2020-02-16
2020-02-23
2020-03-01
2020-03-08
2020-03-15
2020-03-22
2020-03-29
2020-04-05
2020-04-12
2020-04-19
2020-04-26
2020-05-03
2020-05-10
2020-05-17
2020-05-24
2020-05-31
2020-06-07
2020-06-14
2020-06-21
2020-06-28
2020-07-05
2020-07-12
2020-07-19
2020-07-26
2020-08-02
2020-08-09
2020-08-16
2020-08-23
2020-08-30
2020-09-06
2020-09-13
2020-09-20
2020-09-27
2020-10-04
2020-10-11
2020-10-18
2020-10-25
2020-11-01
2020-11-08
2020-11-15
2020-11-22
2020-11-29
2020-12-06
2020-12-13
2020-12-20
2020-12-27
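
The same generator pattern extends to any day of the week. In this sketch, `all_weekdays` is a hypothetical generalization that takes the weekday as a parameter (0 = Monday through 6 = Sunday, matching `date.weekday()`) and uses a modulo to find the first matching day.

```python
import datetime


def all_weekdays(year, weekday):
    """Yield every date in `year` that falls on `weekday` (0 = Monday ... 6 = Sunday)."""
    dt = datetime.date(year, 1, 1)
    # Days to move forward from January 1st to reach the first matching weekday.
    dt += datetime.timedelta(days=(weekday - dt.weekday()) % 7)
    while dt.year == year:
        yield dt
        dt += datetime.timedelta(days=7)


sundays = list(all_weekdays(2020, 6))
wednesdays = list(all_weekdays(2020, 2))

print(len(sundays), sundays[0], sundays[-1])
print(len(wednesdays), wednesdays[0], wednesdays[-1])
```

Collecting the dates into a list also makes it easy to count them: 2020 has 52 Sundays but 53 Wednesdays, because it is a leap year that starts on a Wednesday.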


Write a Python program to get the number of days between two dates.

In [0]:
a = datetime.date(2000,2,28)
b = datetime.date(2001,2,28)
print(b-a)

366 days, 0:00:00
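
Subtracting two dates returns a `timedelta`, and its `days` attribute gives the count as a plain integer. The sketch below also shows why the result above is 366 rather than 365: the span crosses 29 February 2000. The second pair of dates is added here for contrast.

```python
import datetime

# Crosses 29 February 2000, so the span is one day longer.
leap_span = datetime.date(2001, 2, 28) - datetime.date(2000, 2, 28)

# No leap day between these two dates.
common_span = datetime.date(2002, 2, 28) - datetime.date(2001, 2, 28)

print(leap_span.days)
print(common_span.days)
print(leap_span.total_seconds())
```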


Write a Python program to print a string five times, with a three-second delay after each print.

In [0]:
print("\nBelgrave Valley will print five  times, delay for three seconds.")
# Repeat five times, pausing three seconds after each print
for _ in range(5):
    print("Belgrave Valley")
    time.sleep(3)


Belgrave Valley will print five  times, delay for three seconds.
Belgrave Valley
Belgrave Valley
Belgrave Valley
Belgrave Valley
Belgrave Valley


Write a Python program to get the current GMT and local time.

In [0]:
print("\nGMT: "+time.strftime("%a, %d %b %Y %I:%M:%S %p %Z", time.gmtime()))
print("Local: "+time.strftime("%a, %d %b %Y %I:%M:%S %p %Z\n"))


GMT: Mon, 16 Sep 2019 02:15:55 PM GMT
Local: Mon, 16 Sep 2019 03:15:55 PM BST
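
The `time` module returns naive structures, and the text printed by `%Z` can vary by platform. As an alternative sketch, the `datetime` module can produce timezone-aware values: `datetime.now(timezone.utc)` for UTC, and `astimezone()` with no argument to convert to the system's local timezone.

```python
from datetime import datetime, timezone

utc_now = datetime.now(timezone.utc)  # timezone-aware current time in UTC
local_now = utc_now.astimezone()      # same instant in the system's local timezone

print("UTC:  ", utc_now.isoformat(timespec="seconds"))
print("Local:", local_now.isoformat(timespec="seconds"))
```

Both values describe the same instant, so they compare equal even when their displayed hours differ.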

