# Put your questions in order starting here

# Pandas Categorical Part 1

Split out column entries
Load the following survey data into a Pandas dataframe called x and note that the top part of the "Is there anything in particular you want to use Python for?" column looks like the following:
	Is there anything in particular you want to use Python for?
ID	
3931	Data extraction and processing, Data analytics...
4205	Data extraction and processing
3669	Data analytics, Machine learning, Statistical ...
1452	Data extraction and processing, Data analytics...
2968	Numerical processing, Data analytics, Machine ...
The problem with this column is that there are multiple comma-separated values in it. Please write a Python function called split_count that can take this column as input and output the following Pandas dataframe.
	count
All of the above	1
Computer vision	1
Image Processing	1
Computer vision/image processing	1
As a general skill	1
scripting seems desirable for many jobs	1
not sure	1
Computer Vision	1
EDA tools	1
Web development	104
Numerical processing	173
Scientific visualization	198
Statistical analysis	222
Data extraction and processing	291
Data analytics	351
Machine learning	381

Here is the function signature: split_count(x) where x is a pd.Series object and it returns a pd.DataFrame object.
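
One way to approach this is sketched below, under the assumption that entries are separated by a comma followed by a space and that missing responses should be skipped. It uses `Series.explode` to put each entry on its own row before counting. The `demo` series is hypothetical toy data, not the real survey file.

```python
import pandas as pd


def split_count(x):
    """Count each comma-separated entry in a Series of strings."""
    assert isinstance(x, pd.Series)
    counts = (
        x.dropna()            # skip missing responses (assumption)
         .str.split(', ')     # each row becomes a list of entries
         .explode()           # one entry per row
         .value_counts()      # tally identical entries
         .sort_values()       # ascending, as in the expected table
    )
    return counts.to_frame(name='count')


# Hypothetical toy data for illustration only
demo = pd.Series([
    'Data analytics, Machine learning',
    'Machine learning',
    'Data analytics, Machine learning, Web development',
])
print(split_count(demo))
```

Rows with equal counts may come out in any order, so a comparison against a reference result is safest after sorting both by index.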

**Validation Tests** <br>
Check for corner cases and constraints in the inputs. List all cases used for testing.

In [None]:
assert isinstance(x,pd.Series)

**Functional Tests** <br>
Check that the function output matches the expected result. List all cases used for testing.

In [None]:
import pandas as pd
survey_data = pd.read_csv('survey_data.csv') 
x = survey_data.loc[:,'Is there anything in particular you want to use Python for?']

new_s = x.str.split(', ').apply(lambda y: pd.Series(y).value_counts()).sum()                
df = new_s.to_frame()                                                                       
df.columns = ['count']                                                                      
df = df.astype({'count':int})                                                               
assert isinstance(split_count(x),pd.DataFrame) # asserting the output is a dataframe
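# `==` on two dataframes is element-wise and cannot be used directly in an assert
pd.testing.assert_frame_equal(split_count(x).sort_index(), df.sort_index(), check_dtype=False) # assert output matches the dataframe generated above, ignoring row order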

# Pandas Categorical Part 2

Add a new column using Timestamp column
Using the same survey dataframe from before, create a dataframe column month-yr with ID as row-index like the following,
	month-yr
ID	
3931	Sep-2017
4205	Sep-2017
...	...
2524	Jan-2019
Note that each of the entries is a string. That is, given that your original survey dataframe is x, you should be able to produce the output above from
>>> x['month-yr'] 
Your function add_month_yr(x) should take in the x survey dataframe and then output the same dataframe with a new month-yr column.
Here is the function signature: add_month_yr(x) where x is a pd.DataFrame and returns the same pd.DataFrame with the new column. This means all you have to do is take the input dataframe and add a single column to it.
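
A minimal sketch of such a function follows, assuming the `Timestamp` column holds strings that `pd.to_datetime` can parse and that month abbreviations are rendered in an English locale. The `demo` dataframe and its timestamp values are hypothetical.

```python
import pandas as pd


def add_month_yr(x):
    """Add a 'month-yr' string column (e.g. 'Sep-2017') derived from 'Timestamp'."""
    assert isinstance(x, pd.DataFrame)
    # %b is the abbreviated month name, %Y the four-digit year
    x['month-yr'] = pd.to_datetime(x['Timestamp']).dt.strftime('%b-%Y')
    return x  # the same dataframe object, modified in place


# Hypothetical toy data; the real survey timestamps may use another format
demo = pd.DataFrame(
    {'Timestamp': ['2017-09-05 10:15:00', '2019-01-20 08:00:00']},
    index=pd.Index([3931, 2524], name='ID'),
)
print(add_month_yr(demo)['month-yr'])
```

Because the sketch modifies its input in place, a test that needs the original dataframe should take a copy before calling the function.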

**Validation Tests** <br>
Check for corner cases and constraints in the inputs. List all cases used for testing.

In [None]:
assert isinstance(x,pd.DataFrame)

**Functional Tests**<br>
Check that the function output matches the expected result. List all cases used for testing.

In [None]:
import pandas as pd
x = pd.read_csv('survey_data.csv') 
expected = x.copy() # build the expected result before the function modifies x
expected['month-yr'] = pd.to_datetime(expected['Timestamp']).dt.strftime('%b-%Y')
result = add_month_yr(x)
assert isinstance(result,pd.DataFrame) # assert output of function is a dataframe
assert result.equals(expected) # assert output of function is x dataframe with month-yr column

# Rational Numbers

Implement a class of rational numbers (ratio of integers) with the following interfaces and behaviours
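
One possible shape for such a class is sketched below. The method set is inferred from the functional tests in this section rather than from a stated interface, and the `_coerce` helper and the `num`/`den` attribute names are illustrative assumptions.

```python
from math import gcd


class Rational:
    """Ratio of two integers, stored in lowest terms with a positive denominator."""

    def __init__(self, numerator, denominator=1):
        assert isinstance(numerator, int), "the numerator must be an integer"
        assert isinstance(denominator, int), "the denominator must be an integer"
        assert denominator != 0, "the denominator must be non-zero"
        if denominator < 0:  # keep the sign on the numerator
            numerator, denominator = -numerator, -denominator
        g = gcd(numerator, denominator)
        self.num, self.den = numerator // g, denominator // g

    @staticmethod
    def _coerce(other):
        # Hypothetical helper: promote plain ints so Rational <op> int works
        if isinstance(other, Rational):
            return other
        if isinstance(other, int):
            return Rational(other)
        return None

    def __repr__(self):
        return str(self.num) if self.den == 1 else f"{self.num}/{self.den}"

    def __eq__(self, other):
        o = self._coerce(other)
        return NotImplemented if o is None else (self.num, self.den) == (o.num, o.den)

    def __lt__(self, other):  # sorted() only needs <
        o = self._coerce(other)
        return NotImplemented if o is None else self.num * o.den < o.num * self.den

    def __neg__(self):
        return Rational(-self.num, self.den)

    def __add__(self, other):
        o = self._coerce(other)
        if o is None:
            return NotImplemented
        return Rational(self.num * o.den + o.num * self.den, self.den * o.den)

    def __sub__(self, other):
        o = self._coerce(other)
        return NotImplemented if o is None else self + (-o)

    def __mul__(self, other):
        o = self._coerce(other)
        return NotImplemented if o is None else Rational(self.num * o.num, self.den * o.den)

    def __truediv__(self, other):
        o = self._coerce(other)
        return NotImplemented if o is None else Rational(self.num * o.den, self.den * o.num)

    def __rtruediv__(self, other):  # handles int / Rational
        o = self._coerce(other)
        return NotImplemented if o is None else o / self

    def __float__(self):
        return self.num / self.den

    def __int__(self):
        q = abs(self.num) // self.den  # truncate toward zero, like int(float)
        return q if self.num >= 0 else -q
```

Reducing by the greatest common divisor in the constructor means equality can compare numerators and denominators directly, which is what lets `Rational(20,2) == Rational(10,1)` hold.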



**Validation Tests** <br>
Check for corner cases and constraints in the inputs. List all cases used for testing.

In [None]:
# For Rational(numerator, denominator)
assert isinstance(numerator, int), "the numerator must be an integer"
assert isinstance(denominator, int), "the denominator must be an integer"
assert denominator != 0, "the denominator must be non-zero"

**Functional Tests** <br>
Check that the function output matches the expected result. List all cases used for testing.

In [None]:
assert repr(Rational(10,1)) == '10', "check for __repr__ implementation"
assert Rational(20,2) == Rational(10,1), "check for __eq__ implementation and simplification"
assert (sorted([Rational(10,3), Rational(3,10), Rational(5,2), Rational(3,10)]) == [Rational(3,10), Rational(3,10), Rational(5,2), Rational(10,3)]), "check for sorting functionality"
assert Rational(-1,5) + Rational(11,4) * Rational(100,8) - Rational(2,8) == Rational(1357,40), "check for __sub__, __mul__ and __add__ implementations"
assert -Rational(123,2)/7 + 2/Rational(28,5) == Rational(-59, 7), "check for __rtruediv__ and __truediv__ with integer"
assert float(Rational(257,125)) == 2.056, "check for float implementation"
assert int(Rational(10,1)) == 10 , "check for int implementation"

# Rational Square Root


Using your Rational class for representing rational numbers, write a function square_root_rational which takes an input rational number x and returns the square root of x to absolute precision abs_tol. Your function should return a Rational number instance as output. Here is an example,

`square_root_rational(Rational(1112,3),abs_tol=Rational(1,1000)) # output is Rational instance
 10093849/524288`
 


Here is your function signature: square_root_rational(x,abs_tol=Rational(1,1000)).

Hint: Use the bisection algorithm to compute the square root.
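
The bisection idea can be sketched as follows. To keep the example self-contained it uses the standard library's `fractions.Fraction` as a stand-in for the `Rational` class, and the function name `bisect_sqrt` is hypothetical. The starting bracket `[0, max(1, x)]` is one choice among several, so the exact fraction returned can differ from the example output above while still meeting the tolerance.

```python
from fractions import Fraction


def bisect_sqrt(x, abs_tol=Fraction(1, 1000)):
    """Approximate sqrt(x) to within abs_tol using exact rational bisection."""
    assert x >= 0, "x must be non-negative"
    assert abs_tol > 0, "abs_tol must be positive"
    # Invariant: lo**2 <= x <= hi**2, so sqrt(x) always lies in [lo, hi]
    lo, hi = Fraction(0), max(Fraction(1), x)
    while hi - lo > abs_tol:
        mid = (lo + hi) / 2
        if mid * mid <= x:
            lo = mid  # the root is in the upper half
        else:
            hi = mid  # the root is in the lower half
    # The midpoint is at most half the final bracket width away from the root
    return (lo + hi) / 2


approx = bisect_sqrt(Fraction(1112, 3), Fraction(1, 1000))
print(approx, float(approx))
```

Each iteration halves the bracket, so the number of steps grows with the logarithm of `max(1, x) / abs_tol`, while the numerators and denominators grow with every step.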

**Validation Tests** <br>
Check for corner cases and constraints in the inputs. List all cases used for testing.

In [None]:
assert(isinstance(abs_tol, Rational))

**Functional Tests** <br>
Check that the function output matches the expected result. List all cases used for testing.

In [None]:
import random

# set the low, high range of numerators and denominators to test
low = 1
high = 10000

#how many test trials we want to generate
n = 1
nums = [random.randint(low, high) for i in range(n)]
dens = [random.randint(low, high) for i in range(n)]
tols = [random.randint(low, high) for i in range(n)]

# zip 'em and run 'em
# NOTE: This can take a while for large operands and/or small tolerances
for num, den, tol in zip(nums, dens, tols):
    operand = Rational(num, den)
    abs_tol = Rational(1,tol)
    # compare against the tolerance 1/tol (abs_tol), not the integer tol, which would always pass
    assert abs(float(square_root_rational(operand, abs_tol)) - (num/den)**(1/2)) < float(abs_tol), f"[Rational Square Root] Test failed for testcase (numerator={num}, denominator={den}, tol={tol})"
