<a href="https://colab.research.google.com/github/sijuswamy/PyWorks/blob/main/Introduction_to_Python.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

>**Lesson Outcome**

Upon successful completion of this session the participants will be able to:

1. understand the advantages of learning `python` for Statistical analysis

## Why is `python` the right choice for Computational Statistics?

Python is a popular programming language in scientific computing because its many data-oriented packages speed up and simplify data processing, thus saving time.

## What is Python?

- `Python` is an *object-oriented*, high-level programming language.

- Its built-in data structures and properties, combined with dynamic typing (meaning we don’t have to declare the type of variable like in C or Java) and binding, make it ideal for application development and its use as a scripting language.

- Python’s simple syntax emphasizes readability and reduces programming maintenance.

- Python is a general-purpose programming language that can be used for a wide range of tasks such as software creation, web development, and script writing, unlike HTML and CSS, which only describe the structure and styling of web pages.

## What is data analysis?
Data analysis is the process of gathering raw data and converting it into information that the users can use to make decisions.

It entails inspecting, cleansing, transforming, and modeling data to uncover valuable information, draw conclusions, and aid decision-making.

In today’s business world, data analysis plays an important role in making scientific decisions and assisting businesses in operating more efficiently.

**Data mining** is a data analysis technique that emphasizes statistical modeling and information exploration for predictive rather than strictly descriptive purposes.

**Business intelligence** encompasses data analysis that is heavily reliant on aggregation, focusing primarily on business information and decision-making to boost profit turnover.



## Who is a data analyst?
Data analysts are in charge of interpreting data, analyzing the results using statistical techniques, and producing regular reports.

- They plan and carry out data assessments, data collection processes, and other statistical reliability and quality-improvement techniques.

- They are also in charge of data processing and database management from primary and secondary sources.



## What makes Python a brilliant choice for data analysis?

- Easy to learn

- Flexibility

- Huge libraries collection

- Graphics and visualization
- Built-in data analytics tools


# ... So let's jump into `PyWorks 1.0`...


We will be using `Python` a fair amount in this class. `Python` is a high-level scripting language that offers an interactive programming environment. We assume programming experience, so this workshop will focus on the unique properties of `Python`.

Programming languages generally have the following common ingredients: 

>*variables, operators, iterators, conditional statements, functions (built-in and user defined) and higher-order data structures*.

 We will look at these in Python and highlight qualities unique to this language.

## Variables
Variables in `Python` are defined and assigned for you when you set a value to them.

>Example:

```
# assign 2 to the variable 'py_variable'
py_variable = 2
print("Value stored is", py_variable) # displays the value stored in 'py_variable'
type(py_variable)  # return the data type of 'py_variable'
```

Copy and paste this `python` code chunk into a code cell and run it to get the output.

In [None]:
# assign 2 to the variable 'py_variable'
py_variable = 2
print("Value stored is", py_variable) # displays the value stored in 'py_variable'
type(py_variable)  # return the data type of 'py_variable'

Value stored is 2


int

This makes variable definition easy for the programmer. As usual, though, great power comes with great responsibility. For example:



In [None]:
py_varible = py_variable+100  # typo: this creates a NEW variable named 'py_varible'
print(py_variable)            # 'py_variable' itself is unchanged
     

2


> **Note:** If you accidentally mistype a variable name on the left-hand side of an assignment, `Python` will not catch it for you; it silently creates a new variable. This can lead to bugs that are hard to track, so beware.
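
The silent bug above happens because the misspelled name is being assigned to. A minimal sketch of the opposite case, where a misspelled name is read and Python does raise a `NameError` (the names `counter` and `countr` are made up for this illustration):

```python
counter = 2

try:
    # 'countr' is a deliberate misspelling that was never assigned
    print(countr + 100)
except NameError as err:
    message = str(err)
    print("Python caught the typo:", message)
```

So typos in names that are only read fail loudly, while typos in assignment targets pass silently.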

## Types and Typecasting
The usual typecasting is available in `Python`, so it is easy to convert `strings` to `ints` or `floats`, `floats` to `ints`, etc. The syntax is slightly different from C:

In [None]:
a = "1"
b = 5 
print(a+b)

TypeError: can only concatenate str (not "int") to str

This error can be corrected using `type casting` as shown below:

>Note that the typing is dynamic, i.e. a variable that was initially, say, an integer can become another type (float, string, etc.) via reassignment.



In [None]:
a = "1"
b = 5
print(int(a)+b)

6
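
As a small illustration of the note on dynamic typing, the sketch below rebinds one name to values of different types and then shows a few common conversions. The variable name `value` is chosen only for this example.

```python
value = 10            # starts as an integer
print(type(value))    # <class 'int'>

value = 2.5           # the same name now refers to a float
print(type(value))    # <class 'float'>

value = "ten"         # ... and now to a string
print(type(value))    # <class 'str'>

# A few explicit conversions
print(int("42") + 1)  # 43
print(float("3.5"))   # 3.5
print(int(7.9))       # 7 (truncates toward zero, does not round)
print(str(5) + "1")   # 51 (string concatenation)
```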


## Operators

`Python` offers the usual operators such as `+, -, /, *, =, >, <, ==, !=, &, |` (sum, difference, division, product, assignment, greater than, less than, equality comparison, not equal, bitwise and, bitwise or, respectively). The logical operators are written out as the keywords `and`, `or` and `not`.
Additionally, there are `%`, `//` and `**` (modulo, floor division and 'to the power'). Note a few specifics:

In [None]:
print(3/4)
print(3.0 / 4.0)
print(3%4)
print(3//4)
print(3**4)

0.75
0.75
3
0
81


>Note the behavior of `/` when applied to integers! Unlike C/C++ (and Python 2), where dividing two integers discards the fractional part, `/` in Python 3 always performs true division and returns a float, as the output above shows. Floor division must be requested explicitly with `//`. No typecasting to float is needed; the explicit `float()` conversions in the next cell give the same result as plain `a/b`.

In [None]:
a = 3
b = 4
print(a/b)
print(float(a)/float(b))

0.75
0.75
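
The following sketch, written for Python 3, collects a few operator details that are easy to get wrong: floor division with negative numbers, the relation between `//` and `%`, and the difference between the logical keywords and the bitwise symbols.

```python
a, b = 7, 2
print(a / b)    # 3.5, true division always returns a float
print(a // b)   # 3, floor division
print(a % b)    # 1, remainder

# Floor division rounds toward negative infinity, not toward zero
print(-7 // 2)  # -4
print(-7 % 2)   # 1

# Quotient and remainder always satisfy: q * b + r == a
quotient, remainder = divmod(-7, 2)
print(quotient * 2 + remainder)  # -7

# 'and' / 'or' are logical operators; '&' / '|' work bit by bit on integers
print(True and False)  # False
print(6 & 3)           # 2  (0b110 & 0b011 == 0b010)
print(6 | 3)           # 7  (0b110 | 0b011 == 0b111)
```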


## Iterators
`Python` has the usual looping constructs, `while` and `for`, and some other constructions that will be addressed later. Here are examples of each:

In [None]:
# example of for loop
for i in range(1,10):
    print(i)
     


1
2
3
4
5
6
7
8
9


>The most important thing to note above is that the range function gives us values up to, but not including, the upper limit.
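
A short sketch of the other forms of `range`: with a single argument it starts at 0, and an optional third argument sets the step, which may be negative. The variable names are chosen only for this example.

```python
first_five = list(range(5))
odds = list(range(1, 10, 2))
countdown = list(range(10, 0, -3))

print(first_five)  # [0, 1, 2, 3, 4]
print(odds)        # [1, 3, 5, 7, 9]
print(countdown)   # [10, 7, 4, 1]
```

In every case the stop value itself is excluded.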



In [None]:
# example of while loop
i = 1
while i < 10:
    print(i)
    i+=1

## Conditional Statements

In [None]:
a = 20
if a >= 22:
    print("if")
elif a >= 21:
    print("elif")
else:
    print("else")
     

>Again, nothing remarkable here; you just need to learn the syntax. Here, we should also mention spacing. Python is picky about indentation: you must start a new line after each conditional statement (it is the same for the loops above) and indent the same number of spaces for every statement within the scope of that condition.

## Exceptions in `Python`
Python has another control-flow construct that is very useful. Suppose your program is processing user input or data from a file. You don't always know for sure what you are getting in that case, and this can lead to errors at run time. The `try/except` statement lets you handle them!

In [None]:
a = "1"

try:
  b = a + 2 
except TypeError:  # raised because a string and an int cannot be added
  print(a, " is not a number") 


1  is not a number
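
Catching a specific exception type keeps unrelated errors visible. The sketch below applies the same idea to input validation; the helper name `to_number` is hypothetical and not part of the original notebook.

```python
def to_number(text):
    """Return text converted to a float, or None if it is not numeric."""
    try:
        return float(text)
    except ValueError:
        # float() raises ValueError for strings such as "abc"
        return None

for raw in ["1", "2.5", "abc"]:
    number = to_number(raw)
    if number is None:
        print(raw, "is not a number")
    else:
        print(raw, "+ 2 =", number + 2)
```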


## Functions

We can write a repeated process in the form of a user-defined function. The syntax of a function definition in `Python` is:

```
def <function name>(arguments):
  function body
```
>**Example:** `Python` function to calculate division.  


In [None]:
# function definition
def Division(a, b):
    print(a/b)

In [None]:
# calling the function

Division(1,9)

0.1111111111111111


In [None]:
Division(0.5,8)

0.0625


### Modified version with `try/except` options

In [None]:
def Division(a, b):
    try:
        print(a/b)
    except ZeroDivisionError:
        print("cannot divide by zero")
    except TypeError:
        # an argument was not numeric (e.g. the string "2"): convert and retry
        print(float(a)/float(b))


In [None]:
Division(2,"2")
Division(2,0)

1.0
cannot divide by zero
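
Functions are usually more reusable when they return a value instead of printing it. A minimal sketch of that variant follows; the name `safe_divide` and the choice of returning `None` for division by zero are assumptions made for this example.

```python
def safe_divide(a, b):
    """Return a / b as a float, or None when b is zero."""
    try:
        return float(a) / float(b)
    except ZeroDivisionError:
        return None

result = safe_divide(1, 4)
print(result)               # 0.25
print(safe_divide(2, "2"))  # 1.0
print(safe_divide(2, 0))    # None
```

The caller can then decide what to do with the result, for example store it, test it, or print it.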


## Strings and String Handling
One of the most important features of `Python` is its powerful and easy handling of strings. Defining strings is simple enough in most languages. But in `Python`, it is easy to search and replace, convert cases, concatenate, or access elements. 

In [None]:
a = "A string of characters, with newline \n CAPITALS, etc."
print(a)
b=5.0
newstring = a + "\n We can format strings for printing %.2f"
print(newstring %b)

A string of characters, with newline 
 CAPITALS, etc.
A string of characters, with newline 
 CAPITALS, etc.
 We can format strings for printing 5.00


In [None]:
a=4.5
b=6.7
total=a+b  # avoid the name 'sum', which would shadow the built-in sum()
print(("Sum of {} and {} is {}").format(a,b, total))

Sum of 4.5 and 6.7 is 11.2
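
The same output can be produced with an f-string, available from Python 3.6 onward, which embeds expressions directly in the string literal. The variable names `total` and `line` are chosen only for this sketch.

```python
a = 4.5
b = 6.7
total = a + b

# Expressions inside {} are evaluated; ':.1f' formats to one decimal place
line = f"Sum of {a} and {b} is {total:.1f}"
print(line)  # Sum of 4.5 and 6.7 is 11.2
```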


In [None]:
# slicing operation on strings
a = "ABC DEFG"
print(a[1:3])
print(a[0:5])

BC
ABC D


In [None]:
# some string operation examples

a = "ABC defg"
print(a.lower())
print(a.upper())
print(a.find('d'))
print(a.replace('de','a'))
print(a)
b = a.replace('def','aaa')
print(b)
b = b.replace('a','c')
print(b)
b.count('c')


abc defg
ABC DEFG
4
ABC afg
ABC defg
ABC aaag
ABC cccg


3

## Derived datatypes in `Python`

> `Lists, Tuples, Sets, Dictionaries`

### 1. Lists
Lists are exactly as the name implies. They are lists of objects. The objects can be any data type (including lists), and it is allowed to mix data types. In this way they are much more flexible than arrays. It is possible to `append, delete, insert` and `count` elements and to `sort`, `reverse`, etc.

>Example

In [None]:
a_list = [1,2,3,"this is a string",5.3]
b_list = ["A","B","F","G","d","x","c",a_list,3]
print(b_list)
print(b_list[7:9])

['A', 'B', 'F', 'G', 'd', 'x', 'c', [1, 2, 3, 'this is a string', 5.3], 3]
[[1, 2, 3, 'this is a string', 5.3], 3]


In [None]:
a = [1,2,3,4,5,6,7]
a.insert(0,0)
print(a)
a.append(8)
print(a)
a.reverse()
print(a)
a.sort()
print(a)
a.pop()
print(a)
a.remove(3)
print(a)
a.remove(a[4])
print(a)

[0, 1, 2, 3, 4, 5, 6, 7]
[0, 1, 2, 3, 4, 5, 6, 7, 8]
[8, 7, 6, 5, 4, 3, 2, 1, 0]
[0, 1, 2, 3, 4, 5, 6, 7, 8]
[0, 1, 2, 3, 4, 5, 6, 7]
[0, 1, 2, 4, 5, 6, 7]
[0, 1, 2, 4, 6, 7]


**List comprehensions:** Lists can be constructed compactly with a `for` clause and an optional `if` condition written inside square brackets. These are called 'list comprehensions'. For example:



In [None]:
even_numbers = [x for x in range(10) if x % 2 == 0]
print(even_numbers)


[0, 2, 4, 6, 8]
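
For comparison, here is a sketch of the explicit loop that this comprehension replaces, followed by a comprehension that also transforms each element. The name `squares_of_evens` is introduced only for this example.

```python
# Loop version of the comprehension above
even_numbers = []
for x in range(10):
    if x % 2 == 0:
        even_numbers.append(x)
print(even_numbers)  # [0, 2, 4, 6, 8]

# A comprehension can filter and transform in one expression
squares_of_evens = [x ** 2 for x in range(10) if x % 2 == 0]
print(squares_of_evens)  # [0, 4, 16, 36, 64]
```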


>List comprehensions can work on strings as well:

1. Create a list of letters from the sentence `Pyworks 1.0 will help me learn Python.`

In [None]:
first_sentence = "Pyworks 1.0 will help me learn Python."
characters = [x for x in first_sentence]
print(characters)

['P', 'y', 'w', 'o', 'r', 'k', 's', ' ', '1', '.', '0', ' ', 'w', 'i', 'l', 'l', ' ', 'h', 'e', 'l', 'p', ' ', 'm', 'e', ' ', 'l', 'e', 'a', 'r', 'n', ' ', 'P', 'y', 't', 'h', 'o', 'n', '.']


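>**Map function:** Another similar feature is called `map`. `map` applies a function to every element of a sequence (or, given several sequences, to their elements in parallel) and returns a lazy iterator, which is why the results below are wrapped in `list(...)`. The syntax is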
```
map(aFunction, aSequence)
```

Consider the following examples:

In [None]:
def sqr(x): return x ** 2

In [None]:
a = [2,3,4]
b = [10,5,3]
c = map(sqr,a)


In [None]:
print(list(c))

[4, 9, 16]


In [None]:
d = map(pow,a,b)
print(list(d))
  

[1024, 243, 64]
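
A short sketch of two related points: `map` is often combined with an anonymous `lambda` function, and in Python 3 the object it returns can be consumed only once. The equivalent list comprehensions are shown for comparison.

```python
a = [2, 3, 4]
b = [10, 5, 3]

squares = list(map(lambda x: x ** 2, a))
print(squares)  # [4, 9, 16]

# map returns a one-shot iterator: it is empty after being consumed once
lazy = map(lambda x: x ** 2, a)
print(list(lazy))  # [4, 9, 16]
print(list(lazy))  # []

# Equivalent list comprehensions
print([x ** 2 for x in a])             # [4, 9, 16]
print([x ** y for x, y in zip(a, b)])  # [1024, 243, 64]
```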


### Tuples
Tuples are like lists with one very important difference. Tuples are *immutable (not changeable)*.

>Syntax

```
tuple_name=(item1,item2,item3,...)
```

In [None]:
a = (1,2,3,4)
print(a)
# a[1] = 2  # would raise a TypeError: tuples do not support item assignment

In [None]:
a = (1,"string in a tuple",5.3)
b = (a,1,2,3)
print(a)
print(b)
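
To make the immutability concrete, this sketch attempts an item assignment inside `try/except` and then builds a new tuple instead of modifying the old one.

```python
a = (1, 2, 3, 4)

try:
    a[1] = 20  # tuples do not support item assignment
except TypeError as err:
    print("TypeError:", err)

# To "change" a tuple, build a new one from slices of the old one
a = a[:1] + (20,) + a[2:]
print(a)  # (1, 20, 3, 4)
```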

>One other handy feature of tuples is known as 'tuple unpacking'. Essentially, this means we can assign the values of a tuple to a list of variable names, like so:

In [None]:
my_pets = ("Chestnut", "Tibbs", "Dash", "Bast")
(aussie,b_collie,indoor_cat,outdoor_cat) = my_pets
print(aussie)
cats=[indoor_cat,outdoor_cat]
print(cats)
type(cats)


Chestnut
['Dash', 'Bast']


list

In [None]:
type(my_pets)

tuple

### Sets
Sets are unordered collections of unique elements. Intersections, unions and set differences are supported operations. They can be used to remove duplicates from a collection or to test for membership. For example:

In [None]:
A={1,2,3,4,6}
print(A)
B=set([1,2,4,5,2,3,5,2,4,5,6,7,1])
print(B)

{1, 2, 3, 4, 6}
{1, 2, 3, 4, 5, 6, 7}


In [None]:
fruits = set(["apples","oranges","grapes","bananas"])
citrus = set(["lemons","oranges","limes","grapefruits","clementines"])
citrus_in_fruits = fruits & citrus   #intersection
print(citrus_in_fruits)
diff_fruits = fruits - citrus        # set difference
print(diff_fruits)
diff_fruits_reverse = citrus - fruits  # set difference
print(diff_fruits_reverse)
citrus_or_fruits = citrus | fruits     # set union
print(citrus_or_fruits)

{'oranges'}
{'apples', 'grapes', 'bananas'}
{'limes', 'grapefruits', 'lemons', 'clementines'}
{'oranges', 'bananas', 'grapes', 'limes', 'lemons', 'clementines', 'grapefruits', 'apples'}


In [None]:
a_list = ["a", "a","a", "b",1,2,3,"d",1]
print(a_list)
a_set = set(a_list)  # Convert list to set
print(a_set)         # Creates a set with unique elements
new_list = list(a_set) # Convert set to list
print(new_list)        # Obtain a list with unique elements 

['a', 'a', 'a', 'b', 1, 2, 3, 'd', 1]
{1, 2, 3, 'd', 'a', 'b'}
[1, 2, 3, 'd', 'a', 'b']
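
Because sets are unordered, the de-duplicated list above does not keep the original order of the elements. If order matters, one possible approach (a sketch, relying on dictionaries preserving insertion order in Python 3.7+) is `dict.fromkeys`. The example also shows membership testing with `in`.

```python
a_list = ["a", "a", "a", "b", 1, 2, 3, "d", 1]

# dict keys are unique and keep the order of first appearance
unique_in_order = list(dict.fromkeys(a_list))
print(unique_in_order)  # ['a', 'b', 1, 2, 3, 'd']

# Membership tests on a set
a_set = set(a_list)
print("b" in a_set)  # True
print(4 in a_set)    # False
```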


### Dictionaries
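Dictionaries are collections of key-value pairs in which each value is looked up by its key rather than by its position. Lists are ordered by position, and the index may be viewed as a key. (Since Python 3.7, dictionaries preserve insertion order.)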

In [None]:
a = {'anItem': "A", 'anotherItem': "B",'athirdItem':"C",'afourthItem':"D"} # dictionary example
print(a)

{'anItem': 'A', 'anotherItem': 'B', 'athirdItem': 'C', 'afourthItem': 'D'}


In [None]:
Features={'Interior':['wardrobe','modular kitchen','corner sofa'],'Exterior':['sitout','balcony','fountain','garden']}

In [None]:
print(Features['Interior'])

['wardrobe', 'modular kitchen', 'corner sofa']


In [None]:
for i in Features['Interior']: print(i)

wardrobe
modular kitchen
corner sofa
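
A few more everyday dictionary operations, sketched on a smaller dictionary invented for this example (the keys `Parking` and `Roof` are illustrative only): adding a key, updating a stored list, looking up with a default, and iterating over key-value pairs.

```python
features = {"Interior": ["wardrobe", "corner sofa"],
            "Exterior": ["balcony", "garden"]}

features["Parking"] = ["garage"]         # add a new key
features["Exterior"].append("fountain")  # modify a stored list

print(features.get("Roof", []))  # [] is returned instead of raising KeyError
print("Interior" in features)    # True, membership is tested on keys

# items() yields (key, value) pairs
for key, items in features.items():
    print(key, "->", ", ".join(items))
```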


## Classes
A class bundles data (known as attributes) and functions (known as methods) together. We access the attributes and methods of a class using the `.` notation. Since everything in `Python` is an object, we have already been using this attribute access, e.g. when we call `hello.upper()`, we are using the `upper` method of the instance `hello` of the string class.
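
A minimal sketch of a user-defined class follows. The class `Rectangle` and its methods are hypothetical, chosen only to show attributes, methods and the `.` notation side by side with the built-in string example.

```python
class Rectangle:
    """Example class with two attributes and two methods."""

    def __init__(self, width, height):
        # attributes are stored on the instance
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height

    def scale(self, factor):
        """Return a new rectangle scaled by the given factor."""
        return Rectangle(self.width * factor, self.height * factor)

rect = Rectangle(3, 4)
print(rect.width)            # 3, attribute access
print(rect.area())           # 12, method call
print(rect.scale(2).area())  # 48

hello = "hello"
print(hello.upper())         # HELLO, a method of the built-in str class
```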

### Modules
As the code base gets larger, it is convenient to organize it into `modules` or `packages`. At the simplest level, modules can just be regular `python` files. We import functions from modules using one of the following import variants:
```
import numpy
import numpy as np # using an alias
import numpy.linalg as la # modules can have submodules
from numpy import sin, cos, tan # bring trig functions into global namespace
from numpy import * # frowned upon because it pollutes the namespace
```
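
The same import variants can be tried without installing anything. The sketch below swaps in the standard-library modules `math` and `statistics` for `numpy`, purely so that it runs in any Python 3 environment.

```python
import math
import statistics as stats  # using an alias
from math import sqrt, pi   # import selected names only

print(math.factorial(5))         # 120
print(stats.median([1, 3, 5]))   # 3
print(sqrt(16))                  # 4.0
print(round(pi, 2))              # 3.14
```

Names imported with `from ... import` are used without the module prefix, while the other two forms keep the prefix and make the origin of each function explicit.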

## Exercises

1. Solve the FizzBuzz problem

Write a program that prints the numbers from 1 to 15. But for multiples of three print “Fizz” instead of the number and for the multiples of five print “Buzz”. For numbers which are multiples of both three and five print “FizzBuzz”.

In [None]:
for i in range(1,16):
  if i%3==0 and i%5==0:   # 'and' is the logical operator; '&' is bitwise
    print("FizzBuzz")
  elif (i%3==0):
    print("Fizz")
  elif(i%5==0):
    print("Buzz")
  else:
    print(i)
    

1
2
Fizz
4
Buzz
Fizz
7
8
Fizz
Buzz
11
Fizz
13
14
FizzBuzz


2. Given x=3 and y=4, swap the values of x and y so that x=4 and y=3.

In [None]:
x=3
y=4
t=x
x=y
y=t
print(x,y)


4 3
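
An alternative sketch that uses tuple packing and unpacking, so no temporary variable is needed:

```python
x, y = 3, 4

# The right-hand side is packed into a tuple first, then unpacked
x, y = y, x
print(x, y)  # 4 3
```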


3.  Using a dictionary, write a program to calculate the number of times each character occurs in the given string s. Ignore differences in capitalization, i.e. 'a' and 'A' should be treated as a single key. For example, for the string "Malayalam" used below we should get a count of 4 for 'a'.

In [None]:
# initializing string
test_str = "Malayalam"
test_str=test_str.lower() 
# using set() + count() to get count
# of each element in string
res = {i: test_str.count(i) for i in set(test_str)}
 
# printing result
print(("The count of all characters in {} is {}").format(test_str,res))

The count of all characters in malayalam is {'l': 2, 'y': 1, 'a': 4, 'm': 2}
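
Two alternative sketches for the same exercise: a plain loop that fills a dictionary with `dict.get`, and `collections.Counter` from the standard library, which is a dictionary subclass designed for counting. Both scan the string once, whereas calling `count()` per character rescans it for every distinct character. The names `counts_loop` and `counts` are chosen only for this example.

```python
from collections import Counter

test_str = "Malayalam".lower()

# Plain dictionary filled in a single pass
counts_loop = {}
for ch in test_str:
    counts_loop[ch] = counts_loop.get(ch, 0) + 1
print(counts_loop)  # {'m': 2, 'a': 4, 'l': 2, 'y': 1}

# Counter does the same bookkeeping for us
counts = Counter(test_str)
print(counts["a"])            # 4
print(counts.most_common(1))  # [('a', 4)]
```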
