# [Data Analysis with Python](https://www.youtube.com/watch?v=GPVsHOlRBBI) by freeCodeCamp.org
*Instructor: Aakash N S*

*Note: We use camelCase for variable names.*

Course Curriculum
- Lesson 1 & 2: Intro to Programming with Python
- Lesson 3: Numerical computation with NumPy
- Lesson 4: Analyzing tabular data using Pandas
- Lesson 5: Data visualization with Matplotlib and Seaborn
- Lesson 6: Exploratory Data Analysis: A case study
    * Analyze a large real-world dataset
    * Learn to ask the right questions
    * Using the right tools for the job
    * Storytelling and presentation tips

Goal: Create something you can share proudly on your professional profile

Course page: [zerotopandas.com](https://zerotopandas.com/)



# Lesson 1 - Introduction to Programming with Python
* First steps with Python & Jupyter notebooks
* Arithmetic, conditional & logical operators in Python
* Quick tour with Variables and common data types

## First steps with Python & Jupyter notebooks - Syntax of Markdown

\ # This is an <h1>
\ ## This is an <h2>
\ ### This is an <h3>
\ #### This is an <h4>
##### This is an <h5>
###### This is an <h6>

\ This is an h1
\ =============

\ This is an h2
\ -------------

*This text is in italics.*
_And so is this text._

**This text is in bold.**
__And so is this text.__

***This text is in both.***
**_As is this!_**
*__And this!__*

~~This text is rendered with strikethrough.~~

1. Item one
2. Item two
3. Item three
    * Sub-item
    * Sub-item
4. Item four

Boxes below without the 'x' are unchecked HTML checkboxes.
- [ ] First task to complete.
- [ ] Second task that needs to be done
This checkbox below will be a checked HTML checkbox.
- [x] This task has been completed

> Learn more Markdown Basic Syntax [here](https://learnxinyminutes.com/docs/markdown/)

This is a `Markdown cell` and below is a `Code cell`

## Arithmetic, conditional & logical operators in Python

### Operators



Python supports the following arithmetic operators:

| Operator   | Purpose           | Example     | Result    |
|------------|-------------------|-------------|-----------|
| `+`        | Addition          | `2 + 3`     | `5`       |
| `-`        | Subtraction       | `3 - 2`     | `1`       |
| `*`        | Multiplication    | `8 * 12`    | `96`      |
| `/`        | Division          | `100 / 7`   | `14.28..` |
| `//`       | Floor Division    | `100 // 7`  | `14`      |    
| `%`        | Modulus/Remainder | `100 % 7`   | `2`       |
| `**`       | Exponent          | `5 ** 3`    | `125`     |


Try solving some simple problems from this page:
https://www.math-only-math.com/worksheet-on-word-problems-on-four-operations.html . 

You can use the empty cells below and add more cells if required.


#### [Exercises](https://www.math-only-math.com/worksheet-on-word-problems-on-four-operations.html) (Simple)

1. The population of a town is 198568. Out of them 45312 are men and 35678 are women. Find the number of children in the town.

In [10]:
townPopulation = 198568
numberOfMen = 45312
numberOfWomen = 35678
numberOfChildren = townPopulation - (numberOfMen + numberOfWomen)
print("There are {} children.".format(numberOfChildren))

There are 117578 children.


13. Maria bought 96 toys priced equally for $12960. The amount of $1015 is still left with her. Find the cost of each toy and the amount she had.

In [9]:
numberOfToys = 96
toysTotalPrice = 12960
amountLeft = 1015

costOfAToy = toysTotalPrice/numberOfToys
initialMoney = toysTotalPrice+amountLeft

print("Each toy costs {} and she had ${} initially.".format(costOfAToy,initialMoney))

Each toy costs 135.0 and she had $13975 initially.


### Conditionals

Apart from arithmetic operations, Python also provides several operations for comparing numbers & variables.

| Operator    | Description                                                     |
|-------------|-----------------------------------------------------------------|
| `==`        | Check if operands are equal                                     |
| `!=`        | Check if operands are not equal                                 |
| `>`         | Check if left operand is greater than right operand             |
| `<`         | Check if left operand is less than right operand                |
| `>=`        | Check if left operand is greater than or equal to right operand |
| `<=`        | Check if left operand is less than or equal to right operand    |

The result of a comparison operation is either `True` or `False` (note the uppercase `T` and `F`). These are special keywords in Python. Let's try out some experiments with comparison operators.
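
As a minimal sketch of such experiments, the cell below applies a few comparison operators and stores each result. The variable names and values are hypothetical, chosen only for illustration.

```python
# Hypothetical values, chosen only for illustration
my_favorite_number = 1
my_least_favorite_number = 5

# Each comparison evaluates to a boolean: True or False
is_equal = my_favorite_number == 1                                    # equal to 1?
is_different = my_favorite_number != 1                                # not equal to 1?
is_greater = my_least_favorite_number > 3                             # greater than 3?
is_smaller_or_equal = my_favorite_number <= my_least_favorite_number  # 1 <= 5?

print(is_equal, is_different, is_greater, is_smaller_or_equal)
print(type(is_equal))  # the result of a comparison is of type bool
```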

## Quick tour with Variables and common data types

`Variables` are containers for `values`

A variable can have a short name (like x and y) or a more descriptive name (age, carname, total_volume).

Rules for Python variables:

- A variable name must start with a letter or the underscore character
- A variable name cannot start with a number
- A variable name can only contain alpha-numeric characters and underscores (A-z, 0-9, and _ )
- Variable names are case-sensitive (age, Age and AGE are three different variables)
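
The rules above can be checked programmatically. The sketch below is one possible way to do it, using `str.isidentifier` from the standard library; the candidate names are hypothetical. It also rejects reserved keywords such as `class`, a restriction that the list above does not mention and that is added here as an assumption about what counts as a usable name.

```python
import keyword

# Hypothetical candidate names, chosen only for illustration
candidates = ["age", "_total", "total_volume", "2nd_place", "car-name", "class"]

results = {}
for name in candidates:
    # isidentifier() checks the naming rules; keywords such as `class`
    # pass that check but still cannot be used as variable names.
    results[name] = name.isidentifier() and not keyword.iskeyword(name)

for name, is_valid in results.items():
    print("{!r}: {}".format(name, "valid" if is_valid else "invalid"))

# Names are case-sensitive: these are three different variables
age, Age, AGE = 1, 2, 3
print(age, Age, AGE)
```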

>  Syntax: The syntax of a programming language refers to the rules that govern the structure of a valid instruction or statement. If a statement does not follow these rules, Python stops execution and informs you that there is a syntax error. You can think of syntax as the rules of grammar for a programming language.

### Built-in datatypes

Any data or information stored within a Python variable has a type. You can view the type of data stored within a variable using the `type` function.



Python has several built-in data types for storing different kinds of information in variables. Following are some commonly used data types:

1. Integer
2. Float
3. Boolean
4. None
5. String
6. List
7. Tuple
8. Dictionary

Integer, float, boolean, None, and string are *primitive data types* because they represent a single value. Other data types like list, tuple, and dictionary are often called *data structures* or *containers* because they hold multiple pieces of data together.
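
To see these types in practice, the following sketch passes one sample value of each kind to the built-in `type` function. The sample values are hypothetical.

```python
# Hypothetical sample values, one per data type listed above
samples = [42, 3.14, True, None, "hello", [1, 2, 3], (1, 2, 3), {"a": 1}]

for value in samples:
    # type(value) returns the class of the value; __name__ is its short name
    print("{!r} has type {}".format(value, type(value).__name__))

# Collect the type names so they can be inspected together
type_names = [type(value).__name__ for value in samples]
print(type_names)
```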



### Helpful methods

#### `.format` method 
The `.format` method combines values of other data types, e.g., integers, floats, booleans, lists, etc. with strings. You can use `format` to construct output messages for display.

In [3]:
# Input variables
cost_of_ice_bag = 1.25
profit_margin = .2
number_of_bags = 500

# Template for output message
output_template = """If a grocery store sells ice bags at $ {} per bag, with a profit margin of {} %, 
then the total profit it makes by selling {} ice bags is $ {}."""

print(output_template)


If a grocery store sells ice bags at $ {} per bag, with a profit margin of {} %, 
then the total profit it makes by selling {} ice bags is $ {}.


In [4]:
# Inserting values into the string
total_profit = cost_of_ice_bag * profit_margin * number_of_bags
output_message = output_template.format(cost_of_ice_bag, profit_margin*100, number_of_bags, total_profit)

print(output_message)

If a grocery store sells ice bags at $ 1.25 per bag, with a profit margin of 20.0 %, 
then the total profit it makes by selling 500 ice bags is $ 125.0.


#### Formatting using %

The `%` operator is used to format a set of variables enclosed in a `tuple` (a fixed size list), together with a format string, which contains normal text together with `argument specifiers`, special symbols like `%s` and `%d`.

Here are some basic argument specifiers you should know:

`%s` - String (or any object with a string representation, like numbers)

`%d` - Integers

`%f` - Floating point numbers

`%.<number of digits>f` - Floating point numbers with a fixed amount of digits to the right of the dot.

`%x/%X` - Integers in hex representation (lowercase/uppercase)

Source: [www.learnpython.org](https://www.learnpython.org/en/String_Formatting)

In [5]:
# This prints out "John is 23 years old."
name = "John"
age = 23
print("%s is %d years old." % (name, age))

John is 23 years old.


In [6]:
# This prints out: A list: [1, 2, 3]
mylist = [1,2,3]
print("A list: %s" % mylist)

A list: [1, 2, 3]


# Lesson 2 - Next Steps with Python
- Branching with if, elif, and else
- Iteration with while and for loops
- Write reusable code with Functions
- Scope of variables and exceptions

## Branching with if, elif, and else


- Branching with `if`, `else` and `elif`
- Nested conditions and `if` expressions
- Iteration with `while` loops
- Iterating over containers with `for` loops
- Nested loops, `break` and `continue` statements

### The `pass` statement

An `if` statement cannot be empty: there must be at least one statement in every `if` and `elif` block. You can use the `pass` statement to do nothing and avoid getting an error.
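
A small sketch of `pass` in use, with a hypothetical number and messages: the first branch intentionally does nothing, and `pass` keeps the block syntactically valid.

```python
a_number = 9              # hypothetical value, chosen only for illustration
message = 'No message'    # default used when a branch does nothing

if a_number % 2 == 0:
    pass  # nothing to do for even numbers; `pass` keeps the block valid
elif a_number % 3 == 0:
    message = '{} is divisible by 3 but not by 2'.format(a_number)
else:
    message = '{} is not divisible by 2 or 3'.format(a_number)

print(message)
```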

### Shorthand if conditional expression
Python provides a shorter syntax, which allows writing such conditions in a single line of code. It is known as a *conditional expression*, sometimes also referred to as a *ternary operator*. It has the following syntax:

```
x = true_value if condition else false_value
```

It has the same behavior as the following `if`-`else` block:

```
if condition:
    x = true_value
else:
    x = false_value
```

The parity example in the *Nesting* section below applies this syntax.

### Statements and Expressions

The conditional expression highlights an essential distinction between *statements* and *expressions* in Python. 

> **Statements**: A statement is an instruction that can be executed. Every line of code we have written so far is a statement e.g. assigning a variable, calling a function, conditional statements using `if`, `else`, and `elif`, loops using `for` and `while` etc.

> **Expressions**: An expression is some code that evaluates to a value. Examples include values of different data types, arithmetic expressions, conditions, variables, function calls, conditional expressions, etc. 


Most expressions can be executed as statements, but not all statements are expressions. For example, the regular `if` statement is not an expression since it does not evaluate to a value. It merely performs some branching in the code. Similarly, loops and function definitions are not expressions (we'll learn more about these in later sections).

As a rule of thumb, an expression is anything that can appear on the right side of the assignment operator `=`. You can use this as a test for checking whether something is an expression or not. You'll get a syntax error if you try to assign something that is not an expression.
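
The rule of thumb can be tried out directly. In the sketch below (all names and values are hypothetical), several expressions appear on the right side of `=`, and the built-in `compile` function is used to show, without stopping the cell, that an `if` statement in that position is rejected with a `SyntaxError`.

```python
a_number = 13  # hypothetical value, chosen only for illustration

# Expressions evaluate to a value, so they can appear on the right of `=`
total = a_number + 2 * 3               # arithmetic expression
is_odd = a_number % 2 == 1             # condition
label = 'odd' if is_odd else 'even'    # conditional expression
length = len('hello')                  # function call

# An `if` statement has no value; placing it where an expression is
# expected is a syntax error. compile() lets us observe that safely.
try:
    compile("result = if a_number > 10: 'big'", "<example>", "exec")
    statement_is_expression = True
except SyntaxError:
    statement_is_expression = False

print(total, is_odd, label, length, statement_is_expression)
```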

## Iteration with while and for loops

### `break`, `continue` and `pass` statements

Similar to `while` loops, `for` loops also support the `break` and `continue` statements. `break` is used for breaking out of the loop and `continue` is used for skipping ahead to the next iteration.
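
A short sketch of both statements inside a `for` loop. The list of weekdays is a hypothetical example.

```python
weekdays = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']
visited = []

for day in weekdays:
    if day == 'Tuesday':
        continue  # skip the rest of this iteration and move to the next day
    if day == 'Thursday':
        break     # leave the loop entirely; Friday is never reached
    visited.append(day)

print(visited)
```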

### Non-Boolean Conditions

Note that conditions do not necessarily have to be booleans. In fact, a condition can be any value. The value is converted into a boolean automatically, as if by calling the built-in `bool` function. This means that falsy values like `0`, `''`, `{}`, `[]`, etc. evaluate to `False` and all other values evaluate to `True`.

In [1]:
if '':
    print('The condition evaluated to True')
else:
    print('The condition evaluated to False')

The condition evaluated to False


In [2]:
if { 'a': 34 }:
    print('The condition evaluated to True')
else:
    print('The condition evaluated to False')

The condition evaluated to True


### Nesting

> Nested `if`, `else` statements are often confusing to read and prone to human error. It's good to avoid nesting whenever possible, or limit the nesting to 1 or 2 levels.

In [4]:
a_number = 13

if a_number % 2 == 0:
    parity = 'even'
else:
    parity = 'odd'

print('The number {} is {}.'.format(a_number, parity))

The number 13 is odd.


In [5]:
parity = 'even' if a_number % 2 == 0 else 'odd'
print('The number {} is {}.'.format(a_number, parity))

The number 13 is odd.


### Time

You can check **how long a cell takes to execute** by adding the magic command `%%time` at the top of a cell.

`%%time` times the entire cell, whereas `%time` times only the single statement that follows it on the same line.

Both `%%time` and `%time` print two values:

1. CPU times
2. Wall time

In [6]:
%%time

result = 1
i = 1

while i <= 1000:
    result *= i # same as result = result * i
    i += 1 # same as i = i+1

print(result)

4023872600770937735437024339230039857193748642107146325437999104299385123986290205920442084869694048004799886101971960586316668729948085589013238296699445909974245040870737599188236277271887325197795059509952761208749754624970436014182780946464962910563938874378864873371191810458257836478499770124766328898359557354325131853239584630755574091142624174743493475534286465766116677973966688202912073791438537195882498081268678383745597317461360853795345242215865932019280908782973084313928444032812315586110369768013573042161687476096758713483120254785893207671691324484262361314125087802080002616831510273418279777047846358681701643650241536913982812648102130927612448963599287051149649754199093422215668325720808213331861168115536158365469840467089756029009505376164758477284218896796462449451607653534081989013854424879849599533191017233555566021394503997362807501378376153071277619268490343526252000158885351473316117021039681759215109077880193931781141945452572238655414610628921879602238389714760
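
The `%%time` magic is only available in IPython and Jupyter. As a rough equivalent for plain Python scripts (an assumption added here, not part of the course material), the sketch below measures the same loop with `time.perf_counter` for wall time and `time.process_time` for CPU time. The measured values depend on the machine, so no expected output is shown.

```python
import time

wall_start = time.perf_counter()   # wall-clock timer
cpu_start = time.process_time()    # CPU time used by this process

result = 1
i = 1
while i <= 1000:
    result *= i  # same as result = result * i
    i += 1       # same as i = i + 1

cpu_elapsed = time.process_time() - cpu_start
wall_elapsed = time.perf_counter() - wall_start

print("CPU time:  {:.6f} s".format(cpu_elapsed))
print("Wall time: {:.6f} s".format(wall_elapsed))
```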

### `break` and `continue` statements

You can use the `break` statement within the loop's body to immediately stop the execution and *break* out of the loop (even if the condition provided to `while` still holds true).

Sometimes you may not want to end the loop entirely, but simply skip the remaining statements in the loop body and *continue* to the next iteration. You can do this using the `continue` statement.
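
A minimal sketch of both statements in a `while` loop, using hypothetical names: the loop condition is always true, so only `break` ends the loop, and `continue` skips the even numbers.

```python
i = 0
odd_numbers = []

while True:          # the condition alone would never end the loop
    i += 1
    if i > 10:
        break        # stop the loop even though the condition still holds
    if i % 2 == 0:
        continue     # skip even numbers and move on to the next iteration
    odd_numbers.append(i)

print(odd_numbers)
```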

### Iterating using `range` and `enumerate`

The `range` function is used to create a sequence of numbers that can be iterated over using a `for` loop. It can be used in 3 ways:
 
* `range(n)` - Creates a sequence of numbers from `0` to `n-1`
* `range(a, b)` - Creates a sequence of numbers from `a` to `b-1`
* `range(a, b, step)` - Creates a sequence of numbers from `a` to `b-1` with increments of `step`

Let's try it out.
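
The sketch below tries the three forms of `range` and converts each to a list so the numbers are visible. It also shows the built-in `enumerate`, which the heading mentions: it yields pairs of an index and an item while iterating over a container. The list of days is a hypothetical example.

```python
up_to_n = list(range(5))           # 0 up to 4
from_a_to_b = list(range(3, 8))    # 3 up to 7
with_step = list(range(3, 14, 4))  # 3, then 7, then 11
print(up_to_n, from_a_to_b, with_step)

# enumerate pairs each element with its position in the list
a_list = ['Monday', 'Tuesday', 'Wednesday']
pairs = []
for index, day in enumerate(a_list):
    pairs.append((index, day))
    print('The value at position {} is {}.'.format(index, day))
```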

## Write reusable code with Functions

- Creating and using functions in Python
- Local variables, return values, and optional arguments
- Reusing functions and using Python library functions
- Exception handling using `try`-`except` blocks
- Documenting functions using docstrings

### Creating and using functions in Python


A function is a reusable set of instructions that takes zero or more inputs, performs some operations, and often returns an output.

You can define a new function using the `def` keyword.

In [8]:
def say_hello():
    print('Hello there!')
    print('How are you?')

The statements inside a function's body are not executed when the function is defined. To execute the statements, we need to *call* or *invoke* the function.

In [9]:
say_hello()

Hello there!
How are you?


#### Function arguments

Functions can accept zero or more values as *inputs* (also known as *arguments* or *parameters*). Arguments help us write flexible functions that can perform the same operations on different values. Further, functions can return a result that can be stored in a variable or used in other expressions.

Here's a function that filters out the even numbers from a list and returns a new list using the `return` keyword.

In [11]:
def filter_even(number_list):
    result_list = []
    for number in number_list:
        if number % 2 == 0:
            result_list.append(number)
    return result_list

even_list = filter_even([1, 2, 3, 4, 5, 6, 7])
print(even_list)

[2, 4, 6]


### Local variables, return values, and optional arguments

#### Writing great functions in Python

##### Local variables and scope

Local variable
- A variable defined inside a function is not accessible outside it. Such variables are local variables that lie within the scope of the function.

Scope
- Scope refers to the region within the code where a particular variable is visible. Every function (or class definition) defines a scope within Python. Variables defined in this scope are called local variables. Variables that are available everywhere are called global variables. Scope rules allow you to use the same variable names in different functions without sharing values from one to the other.
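
The following sketch illustrates the difference, using a hypothetical function `price_with_tax`: the global variable `tax_rate` can be read inside the function, while the local variable `tax` is not visible outside it.

```python
tax_rate = 0.25  # global variable (hypothetical value), visible everywhere

def price_with_tax(price):
    tax = price * tax_rate   # `tax` is local to this function
    return price + tax

total = price_with_tax(100)
print(total)

try:
    print(tax)               # `tax` does not exist outside the function
    tax_is_visible = True
except NameError as error:
    print('NameError:', error)
    tax_is_visible = False
```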



##### Return values
A function can use the `return` keyword to hand its result back to the caller, so that the result can be stored in a variable or used in another expression.

##### Optional arguments


An argument in Python that has a default value is called an optional argument. A default value for an argument can be specified using the assignment operator. When calling a function, there is no requirement to provide a value for an optional argument.

Source: [www.tutorialspoint.com](https://www.tutorialspoint.com/How-to-pass-optional-parameters-to-a-function-in-Python#:~:text=Optional%20Arguments%20in%20Python,value%20for%20an%20optional%20argument.)

In [17]:
def loan_emi(amount, duration, down_payment=0):
    loan_amount = amount - down_payment
    emi = loan_amount / duration
    return emi

In [16]:
emi1 = loan_emi(1260000, 8*12, 3e5)
print(emi1)

10000.0


In [15]:
emi2 = loan_emi(1260000, 10*12)
print(emi2)

10500.0


##### Named arguments

Invoking a function with many arguments can often get confusing and is prone to human errors. Python provides the option of invoking functions with *named* arguments for better clarity. You can also split function invocation into multiple lines.

In [22]:
def loan_emi(amount, duration, rate, down_payment=0):
    loan_amount = amount - down_payment
    emi = loan_amount * rate * ((1+rate)**duration) / (((1+rate)**duration)-1)
    return emi

In [24]:
emi1 = loan_emi(
    amount=1260000, 
    duration=8*12, 
    rate=0.1/12, 
    down_payment=3e5
)

print(emi1)

14567.19753389219


In [25]:
emi2 = loan_emi(amount=1260000, duration=10*12, rate=0.08/12)
print(emi2)

15287.276888775077


### Reusing functions and using Python library functions

#### Modules and library functions

Modules
- Modules are files containing Python code (variables, functions, classes, etc.). They provide a way of organizing the code for large Python projects into files and folders. The key benefit of using modules is namespaces: you must import the module to use its functions within a Python script or notebook. Namespaces provide encapsulation and avoid naming conflicts between your code and a module or across modules.

In [27]:
import math

In [28]:
help(math.ceil)

Help on built-in function ceil in module math:

ceil(x, /)
    Return the ceiling of x as an Integral.
    
    This is the smallest integer >= x.
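
As a short sketch of namespaces in practice (the variable names and values are hypothetical), the cell below uses `math.ceil` through the module's namespace, imports `floor` directly, and shows that a variable named `pi` in your own code does not clash with `math.pi`.

```python
import math                 # access names through the module's namespace
from math import floor      # or import a single name directly

days = 10                            # hypothetical trip length
weeks_to_pay = math.ceil(days / 7)   # 10 / 7 is about 1.43, rounded up
full_weeks = floor(days / 7)         # the same value rounded down

# A name in your own code does not clash with the module's attribute
pi = 3                               # a rough, hypothetical approximation
print(weeks_to_pay, full_weeks, pi, math.pi)
```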




## Scope of variables and exceptions


### Exception handling using `try`-`except` blocks

#### Exceptions and try-except

You can use the `try` and `except` statements to handle an exception.

Exception
- Even if a statement or expression is syntactically correct, it may cause an error when the Python interpreter tries to execute it. Errors detected during execution are called exceptions. Exceptions typically stop further execution of the program unless handled within the program using try-except statements.

In [29]:
try:
    print("Now computing the result..")
    result = 5 / 0
    print("Computation was completed successfully")
except ZeroDivisionError:
    print("Failed to compute result because you were trying to divide by zero")
    result = None

print(result)

Now computing the result..
Failed to compute result because you were trying to divide by zero
None


### Documenting functions using docstrings

We can add some documentation within our function using a *docstring*. A docstring is simply a string that appears as the first statement within the function body, and is used by the `help` function. A good docstring describes what the function does, and provides some explanation about the arguments.

In [31]:
def loan_emi(amount, duration, rate, down_payment=0):
    """Calculates the equal monthly installment (EMI) for a loan.
    
    Arguments:
        amount - Total amount to be spent (loan + down payment)
        duration - Duration of the loan (in months)
        rate - Rate of interest (monthly)
        down_payment (optional) - Optional initial payment (deducted from amount)
    """
    loan_amount = amount - down_payment
    try:
        emi = loan_amount * rate * ((1+rate)**duration) / (((1+rate)**duration)-1)
    except ZeroDivisionError:
        emi = loan_amount / duration
    emi = math.ceil(emi)
    return emi

In [32]:
help(loan_emi)

Help on function loan_emi in module __main__:

loan_emi(amount, duration, rate, down_payment=0)
    Calculates the equal monthly installment (EMI) for a loan.
    
    Arguments:
        amount - Total amount to be spent (loan + down payment)
        duration - Duration of the loan (in months)
        rate - Rate of interest (monthly)
        down_payment (optional) - Optional initial payment (deducted from amount)



## Exercise - Data Analysis for Vacation Planning



You're planning a vacation, and you need to decide which city you want to visit. You have shortlisted four cities and identified the return flight cost, daily hotel cost, and weekly car rental cost. While renting a car, you need to pay for entire weeks, even if you return the car sooner.


| City | Return Flight (`$`) | Hotel per day (`$`) | Weekly Car Rental  (`$`) | 
|------|--------------------------|------------------|------------------------|
| Paris|       200                |       20         |          200           |
| London|      250                |       30         |          120           |
| Dubai|       370                |       15         |          80           |
| Mumbai|      450                |       10         |          70           |         


Answer the following questions using the data above:

1. If you're planning a 1-week long trip, which city should you visit to spend the least amount of money?
2. How does the answer to the previous question change if you change the trip's duration to four days, ten days or two weeks?
3. If your total budget for the trip is `$1000`, which city should you visit to maximize the duration of your trip? Which city should you visit if you want to minimize the duration?
4. How does the answer to the previous question change if your budget is `$600`, `$2000`, or `$1500`?

*Hint: To answer these questions, it will help to define a function `cost_of_trip` with relevant inputs like flight cost, hotel rate, car rental rate, and duration of the trip. You may find the `math.ceil` function useful for calculating the total cost of car rental.*

### Solutions
1. If you're planning a 1-week long trip, which city should you visit to spend the least amount of money?

In [132]:
import math
def vacationCost(city:str,vacationDay:int,returnFlight:float,hotelPerDay:float,weeklyCarRental:float):
    week = math.ceil(vacationDay/7)
    return (city,returnFlight+(hotelPerDay*vacationDay)+(weeklyCarRental*week),vacationDay)

def leastCost(cost:list):
    cost.sort(key = lambda x: x[1])
    return cost[0]

oneWeek = leastCost([vacationCost("Paris",7,200,20,200),vacationCost("London",7,250,30,120),vacationCost("Dubai",7,370,15,80),vacationCost("Mumbai",7,450,10,70)])

print("You should visit %s for $%.2f to spend the least amount of money" % (oneWeek[0],oneWeek[1]))

You should visit Paris for $540.00 to spend the least amount of money


2. How does the answer to the previous question change if you change the trip's duration to four days, ten days or two weeks?

In [133]:
fourDays = leastCost([vacationCost("Paris",4,200,20,200),vacationCost("London",4,250,30,120),vacationCost("Dubai",4,370,15,80),vacationCost("Mumbai",4,450,10,70)])
tenDays = leastCost([vacationCost("Paris",10,200,20,200),vacationCost("London",10,250,30,120),vacationCost("Dubai",10,370,15,80),vacationCost("Mumbai",10,450,10,70)])
twoWeeks = leastCost([vacationCost("Paris",14,200,20,200),vacationCost("London",14,250,30,120),vacationCost("Dubai",14,370,15,80),vacationCost("Mumbai",14,450,10,70)])

print("Trip Duration\tCity to choose\tCost")
print("4 days\t\t%s\t\t%.2f" % (fourDays[0],fourDays[1]))
print("1 week\t\t%s\t\t%.2f" % (oneWeek[0],oneWeek[1]))
print("10 days\t\t%s\t\t%.2f" % (tenDays[0],tenDays[1]))
print("2 weeks\t\t%s\t\t%.2f" % (twoWeeks[0],twoWeeks[1]))



Trip Duration	City to choose	Cost
4 days		Paris		480.00
1 week		Paris		540.00
10 days		Dubai		680.00
2 weeks		Mumbai		730.00



3. If your total budget for the trip is `$1000`, which city should you visit to maximize the duration of your trip? Which city should you visit if you want to minimize the duration?

In [137]:
# Note: cost = returnFlight + (durationInDays * hotelPerDay) + (math.ceil(durationInDays/7) * weeklyCarRental)
# Ignoring the rounding up to whole weeks, durationInDays is roughly (cost - returnFlight) / (hotelPerDay + weeklyCarRental/7)


def greatestCost(cost:list):
    cost.sort(key = lambda x: x[1])
    return cost[-1]

def maximizeDuration(budget:float, cityCost):
    day = 1
    city = (None,0,day)
    prev = city
    while True:
        if city[1] > budget:
            break
        else:
            prev = city
        city = leastCost([vacationCost("Paris",day,200,20,200),vacationCost("London",day,250,30,120),vacationCost("Dubai",day,370,15,80),vacationCost("Mumbai",day,450,10,70)])
        day += 1
    return (prev[0],prev[1],prev[2])

def minimizeDuration(budget:float, cityCost):
    day = 1
    city = (None,0,day)
    prev = city
    while True:
        if city[1] > budget:
            break
        else:
            prev = city
        city = greatestCost([vacationCost("Paris",day,200,20,200),vacationCost("London",day,250,30,120),vacationCost("Dubai",day,370,15,80),vacationCost("Mumbai",day,450,10,70)])
        day += 1
    return (prev[0],prev[1],prev[2])


cityCost = [("Paris",4,200,20,200),("London",4,250,30,120),("Dubai",4,370,15,80),("Mumbai",4,450,10,70)]

cityMaxed1000 = maximizeDuration(1000,cityCost)
cityMinimized1000 = minimizeDuration(1000,cityCost)

print("To maximize the duration on a $1000 budget, you should visit %s at a cost of %.2f for %d days."%(cityMaxed1000[0],cityMaxed1000[1], cityMaxed1000[2]))
print("To minimize the duration on a $1000 budget, you should visit %s at a cost of %.2f for %d days."%(cityMinimized1000[0],cityMinimized1000[1],cityMinimized1000[2]))

To maximize the duration on a $1000 budget, you should visit Mumbai at a cost of 1000.00 for 27 days.
To minimize the duration on a $1000 budget, you should visit London at a cost of 910.00 for 14 days.


4. How does the answer to the previous question change if your budget is `$600`, `$2000`, or `$1500`?

In [None]:
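# Reuses maximizeDuration, minimizeDuration and cityCost from the previous cell
print("Maximize:")
for budget in (600, 2000, 1500):
    print(budget, maximizeDuration(budget, cityCost))

print("Minimize:")
for budget in (600, 2000, 1500):
    print(budget, minimizeDuration(budget, cityCost))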