 ## Data Analysis Using [Python](https://www.python.org) - Basic Data Types

In [11]:
# Python is a dynamically typed language.
# Dynamically typed languages do type checking at run time rather than at compile time.

iam_integer = 100
iam_float = 3.14
iam_str = "Hello"
iam_bool = True
iam_complex = 3+4j

### Everything in Python is an object
### Everything in Python has a type
### **type** and **object** are special objects in Python
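
The three statements above can be checked directly in the interpreter. This is only a small sketch; the name `value` is a hypothetical variable and not part of the notebook.

```python
# A name can be rebound to objects of different types (dynamic typing).
value = 100
first_type = type(value)       # int
value = "one hundred"
second_type = type(value)      # str

# Every value is an instance of object, including types and functions.
everything_is_object = all(
    isinstance(thing, object) for thing in (100, 3.14, "text", int, type, len)
)

# type is its own type, and every class is an instance of type.
type_of_type = type(type)
int_is_a_type = isinstance(int, type)

print(first_type, second_type)
print(everything_is_object, type_of_type, int_is_a_type)
```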


In [7]:
print(type(iam_integer))
print(type(iam_float))
print(type(iam_str))
print(type(iam_bool))
print(type(iam_complex))

<class 'int'>
<class 'float'>
<class 'str'>
<class 'bool'>
<class 'complex'>


### What Are Namespaces In Python?
#### A namespace is a simple system to control the names in a program. It ensures that names are unique and won’t lead to any conflict.
### Local Namespace
#### This namespace covers the local names inside a function. Python creates this namespace for every function called in a program. It remains active until the function returns.
### Global Namespace
#### This namespace covers the names defined at the top level of a module, including any names it imports. Python creates a global namespace for every module included in your program, and it lasts until the program ends.
### Built-in Namespace
#### This namespace covers the built-in functions and built-in exception names. Python creates it as the interpreter starts and keeps it until you exit.
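
A minimal sketch of how the three namespaces interact when Python looks up a name. The names `counter`, `shadow`, and `read_global` are made up for illustration and do not come from the notebook.

```python
import builtins

counter = 10  # module-level name: lives in the global namespace

def shadow():
    counter = 99       # local name; hides the global one only inside this call
    return counter

def read_global():
    return counter     # no local 'counter' here, so the global one is found

local_value = shadow()
global_value = read_global()
builtin_found = hasattr(builtins, "len")   # 'len' lives in the built-in namespace

print(local_value, global_value, builtin_found)
```

The global `counter` is untouched by the assignment inside `shadow`, because that assignment created a separate local name.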

<img src="./namespace.jpg" alt="Python Namespace" height="542" width="542" align="left">

In [12]:
a = 2
a = a +1
b = 2

<img src="./namespace_example.png" alt="namespace" height="550" width="550" align="left">

* Every object has an identity, which is unique for the lifetime of that object
* **variable a** in the namespace points to object 2
* after `a = a + 1`, **variable a** moves to point to object 3
* a new name b is created in the namespace and points to object 2

In [13]:
print(id(a))
print(id(b))
print(id(2))

94263333794496
94263333794464
94263333794464
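
In the output above, `id(b)` equals `id(2)`. That equality reflects CPython caching small integers and should be treated as an implementation detail. A sketch that does not depend on caching uses a list instead (the names `first`, `second`, and `clone` are illustrative only):

```python
first = [1, 2, 3]
second = first           # a second name for the same object
clone = list(first)      # a new object with equal contents

same_object = second is first
same_id = id(second) == id(first)
equal_but_distinct = (clone == first) and (clone is not first)

second.append(4)         # a mutation is visible through both names
print(first)             # [1, 2, 3, 4]
print(clone)             # [1, 2, 3]
```

`is` compares identity, while `==` compares values, which is why `clone` is equal to `first` before the append without being the same object.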


In [14]:
# There are many other names in this namespace which are brought in by Jupyter
print(dir())

['In', 'Out', '_', '__', '___', '__builtin__', '__builtins__', '__doc__', '__loader__', '__name__', '__package__', '__spec__', '_dh', '_i', '_i1', '_i10', '_i11', '_i12', '_i13', '_i14', '_i2', '_i3', '_i4', '_i5', '_i6', '_i7', '_i8', '_i9', '_ih', '_ii', '_iii', '_oh', 'a', 'b', 'exit', 'get_ipython', 'iam_bool', 'iam_complex', 'iam_float', 'iam_integer', 'iam_str', 'my_dir', 'quit']


In [15]:
# %load utilities.py
#!/usr/bin/env python

import re

def my_dir(mylist):
    # Drop IPython history names such as _i1 or _dh, plus Jupyter helper names
    pattern = re.compile('_[0-9a-z]+')
    return [x for x in mylist if not pattern.match(x) and
            x not in ('In','Out','_','__','___','exit','quit','get_ipython')]

In [17]:
#import utilities
print(my_dir(dir()))

['__builtin__', '__builtins__', '__doc__', '__loader__', '__name__', '__package__', '__spec__', 'a', 'b', 'iam_bool', 'iam_complex', 'iam_float', 'iam_integer', 'iam_str', 'my_dir']


## [Python Library Reference](https://docs.python.org/3.8/library/index.html)
#### [Built-in Functions](https://docs.python.org/3.8/library/functions.html) - Available as soon as Python starts
#### [Standard Library](https://docs.python.org/3.8/library/) - Installed along with the standard Python installation, e.g. sys, os
#### [External modules](https://pypi.org/) - Can be downloaded from the Python Package Index, e.g. numpy, pandas

# Numbers

In [15]:
2 + 2

4

In [16]:
5 * 3

15

In [17]:
100 // 21 # floor division; in Python 3, 100 / 21 is true division and returns a float

4

In [18]:
100/21.0

4.761904761904762

In [19]:
import math
radius = 10 # 10 centimeters
area = math.pi * radius**2
print(area)

314.1592653589793
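
A short sketch of the arithmetic operators, assuming Python 3. The variable names are illustrative only.

```python
true_quotient = 100 / 21          # true division always returns a float
floor_quotient = 100 // 21        # floor division: 4
remainder = 100 % 21              # modulo: 16
quotient, rest = divmod(100, 21)  # both at once: (4, 16)
power = 2 ** 10                   # exponentiation: 1024

print(true_quotient, floor_quotient, remainder, quotient, rest, power)
```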


# Strings

## Strings in Python are immutable

In [20]:
str_a = 'This string uses single quotes'

str_b = "This string uses double quotes"

str_c = """This is a multi line string
This is second line
This is third line
"""

str_d = "This doesn't contain escape characters"
str_e = 'There are some "SPECIAL" words in this sentence'
str_f = 'It is fine to use escape character\'s some times'

# There are some special characters, such as \t (tab) and \n (newline)

str_g = "Everything in Python is an object\nEvery object in Python has type\nPython is dynamically typed language"

In [21]:
print(str_g)

Everything in Python is an object
Every object in Python has type
Python is dynamically typed language
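
Immutability means a string cannot be changed in place; any "modification" builds a new string. A small sketch, with the hypothetical names `greeting` and `capitalized`:

```python
greeting = "hello"

try:
    greeting[0] = "H"              # item assignment is not allowed on str
except TypeError as error:
    message = str(error)

capitalized = "H" + greeting[1:]   # build a new string instead

print(message)
print(greeting, capitalized)       # the original string is unchanged
```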


# String methods

* ### Strings can be indexed
* ### startswith, endswith
* ### strip, split, replace, partition
* ### index,count, find
* ### upper, lower
* ### join, format
* ### string slicing
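
Some of the methods listed above (`partition`, `replace`, `join`, `format`) can be sketched as follows. The sample string and variable names are assumptions made for illustration.

```python
line = "name=Python Express"

key, separator, value = line.partition("=")   # split once at the first "="
dashed = value.replace(" ", "-")               # returns a new string
joined = ", ".join(["tutors", "organizations", "students"])
report = "{} has {} characters".format(value, len(value))

print(key, separator, value)
print(dashed)
print(joined)
print(report)
```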


In [22]:
my_string = "PythonExpress: brings tutors, $organizations and students: togetherto spread the love of Python"

In [23]:
len(my_string) # returns length of string

95

In [24]:
my_string.startswith("Python") # Returns True or False

True

In [25]:
my_string.endswith("Python") # Returns True or False

True

In [26]:
"   This is a test String ".strip() # removes the leading and trailing spaces

'This is a test String'

In [27]:
"This line has return line characters at the end\n\n\n".strip("\n")

'This line has return line characters at the end'

In [28]:
print(my_string.split()) # default delimiter is any run of whitespace

['PythonExpress:', 'brings', 'tutors,', '$organizations', 'and', 'students:', 'togetherto', 'spread', 'the', 'love', 'of', 'Python']


In [29]:
print(my_string.split(":")) # passing a delimiter
# What is the type of the output of split?
# What if the delimiter is not present in my_string? Would the split operation fail?

['PythonExpress', ' brings tutors, $organizations and students', ' togetherto spread the love of Python']


In [30]:
my_string.find("$") # returns the index of the first occurrence of the character $

30

In [31]:
my_string.count("students") # returns the number of occurrences of the word/character

1

In [32]:
my_string.count("python") # returns 0 if the substring is not found

0

In [33]:
my_string.upper() # converts to uppercase

'PYTHONEXPRESS: BRINGS TUTORS, $ORGANIZATIONS AND STUDENTS: TOGETHERTO SPREAD THE LOVE OF PYTHON'

In [34]:
my_string.index("tutors") # returns the starting index of the substring

22
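
`find` and `index` return the same position when the substring exists, but they differ when it is missing. A sketch on a made-up `text` string:

```python
text = "brings tutors and students together"

found_at = text.find("tutors")     # 7
missing = text.find("teachers")    # -1 signals "not found"

try:
    text.index("teachers")         # index raises instead of returning -1
except ValueError:
    index_raised = True
else:
    index_raised = False

print(found_at, missing, index_raised)
```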

In [35]:
# string slicing
print(my_string[0:15])  # returns the characters from index 0 up to 15 (excluding 15)
print(my_string[10:25]) # returns the characters from index 10 up to 25 (excluding 25)
print(my_string[25:])   # from index 25 to the end of the string
print(my_string[:25])   # from the beginning up to 25 (excluding 25)
print(my_string[:])     # complete string

PythonExpress: 
ess: brings tut
ors, $organizations and students: togetherto spread the love of Python
PythonExpress: brings tut
PythonExpress: brings tutors, $organizations and students: togetherto spread the love of Python
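
Slices also accept negative indices and an optional step. A brief sketch, using a hypothetical `word` variable:

```python
word = "PythonExpress"

last_seven = word[-7:]      # counts from the end: 'Express'
without_tail = word[:-7]    # everything except the last seven: 'Python'
every_second = word[::2]    # step of 2: 'PtoEpes'
backwards = word[::-1]      # negative step reverses: 'sserpxEnohtyP'

print(last_seven, without_tail, every_second, backwards)
```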


In [36]:
# String concatenation

my_statement = "This" + " " + "is" + " a " + "test statement"
my_statement

'This is a test statement'

In [37]:
print("*"*3 + " Title " + "*"*3)

*** Title ***


In [39]:
# What happens if the string "Title" is divided by 3?
# What happens if the integer 5 is added to the string "Title"?

# Exercises

## Explore string "Monty Python"

<img src="./fig_list_index.png" alt="namespace" height="550" width="550" align="left">


* ### Find the length of the string "String in Python is an array of characters"
* ### How many occurrences of the word "people" are there in the sentence below?

### Company's goal is to make financial expertise broadly accessible and effective in helping people live the lives they want. With assets under administration of \$5.2 trillion, including managed assets of \$2.1 trillion as of April 30, 2015, we focus on meeting the unique needs of a diverse set of customers: helping more than 24 million people invest their own life savings, nearly 20,000 businesses manage employee benefit programs, as well as providing nearly 10,000 advisory firms with technology solutions to invest their own clients' money.

* ### Extract the substring "assets under administration of \$5.2 trillion" from the above sentence using indices
* ### Remove "." from the above sentence and split the sentence using "," as the delimiter



# Lists

* ### The list is the most versatile compound data type; it is written as comma-separated values (items) between square brackets
* ### Lists in Python are mutable
* ### Items of a list can be any Python object
* ### [List methods](https://docs.python.org/3.8/tutorial/datastructures.html#more-on-lists): append, extend, insert, remove, pop, index, count, sort, reverse
* ### The `in` operator checks for the presence of an element
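
A few of the list methods named above, sketched on a hypothetical `funds` list that is not part of the notebook:

```python
funds = ["Equity", "Bond", "Money Market", "Bond"]

funds.remove("Bond")           # removes only the first matching item
has_bond = "Bond" in funds     # still True, one "Bond" remains
funds.sort()                   # sorts in place and returns None
ordered_copy = sorted(funds, reverse=True)   # returns a new list instead

print(funds)
print(ordered_copy)
```

`sort` changes the list itself, whereas `sorted` leaves it alone and hands back a new list, which matters when other names refer to the same list.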

In [74]:
# list can have different types of objects
my_list = ['Python','java',25,32,43.55,'C++']

In [75]:
len(my_list) # returns the length of the list

6

In [76]:
my_list[1] # returns second element of list

'java'

In [77]:
my_list[1] = 'Java' # lists are mutable

In [78]:
my_list

['Python', 'Java', 25, 32, 43.55, 'C++']

In [79]:
new_list = my_list[1:4] # list slice

In [80]:
new_list

['Java', 25, 32]

In [81]:
my_list.append("DotNet") # appends the string at the end of the list

In [82]:
my_list

['Python', 'Java', 25, 32, 43.55, 'C++', 'DotNet']

In [83]:
my_list.extend(['R','SPSS','MATLAB']) # extending a list using another list

In [84]:
my_list

['Python', 'Java', 25, 32, 43.55, 'C++', 'DotNet', 'R', 'SPSS', 'MATLAB']

In [85]:
# list can contain duplicate items
my_list.append("Python")

In [86]:
print(my_list)

['Python', 'Java', 25, 32, 43.55, 'C++', 'DotNet', 'R', 'SPSS', 'MATLAB', 'Python']


In [87]:
my_list.count("Python")

2

In [88]:
my_list

['Python',
 'Java',
 25,
 32,
 43.55,
 'C++',
 'DotNet',
 'R',
 'SPSS',
 'MATLAB',
 'Python']

In [89]:
print(my_list)

['Python', 'Java', 25, 32, 43.55, 'C++', 'DotNet', 'R', 'SPSS', 'MATLAB', 'Python']


In [90]:
# Please do not run this multiple times, pop removes an element each time
last_element = my_list.pop()
last_element

'Python'

In [91]:
my_list.index("Java")

1

In [92]:
'Python' in my_list # checking if object is present inside the list

True

In [93]:
# this modifies the original list (in-place reverse)
my_list.reverse()

In [94]:
print(my_list)

['MATLAB', 'SPSS', 'R', 'DotNet', 'C++', 43.55, 32, 25, 'Java', 'Python']


In [95]:
# inserting at index 1, i.e. the second position (please note that the index starts at 0)
my_list.insert(1,'SPSS')

In [96]:
print(my_list)

['MATLAB', 'SPSS', 'SPSS', 'R', 'DotNet', 'C++', 43.55, 32, 25, 'Java', 'Python']


In [97]:
# List of Lists
list_of_lists = [['Python','C++','Java'],[2.7,4.2,8.0],['Object',2.5],'Main']
list_of_lists

[['Python', 'C++', 'Java'], [2.7, 4.2, 8.0], ['Object', 2.5], 'Main']

In [98]:
list_of_lists[0]

['Python', 'C++', 'Java']

In [99]:
# Accessing list
for item in my_list: # iterates from the first element to last element
    print("Programming Language : ", item)

Programming Language :  MATLAB
Programming Language :  SPSS
Programming Language :  SPSS
Programming Language :  R
Programming Language :  DotNet
Programming Language :  C++
Programming Language :  43.55
Programming Language :  32
Programming Language :  25
Programming Language :  Java
Programming Language :  Python


# Tuple

* ## Tuples are very similar to lists except that they are immutable
* ## A tuple consists of a number of values separated by commas, usually enclosed in round brackets
* ## Tuples can contain mutable objects like lists

In [100]:
my_tuple = 'Equity', # observe the comma at the end

In [101]:
my_tuple

('Equity',)

In [102]:
another_tuple = ('Equity Fund','1 Year',13.5)

In [103]:
another_tuple[1]

'1 Year'

In [107]:
len(another_tuple)

3

In [108]:
tuple_list = ([1,2,3],['a','b','c'],'Another String')

In [109]:
tuple_list[0].append(4)

In [110]:
tuple_list

([1, 2, 3, 4], ['a', 'b', 'c'], 'Another String')
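
The cells above show that a list stored inside a tuple can still be mutated. The tuple itself, however, rejects item assignment. A minimal sketch, with the made-up names `fund`, `name`, `period`, and `return_pct`:

```python
fund = ("Equity Fund", "1 Year", 13.5)

try:
    fund[2] = 14.0                  # tuples do not support item assignment
except TypeError:
    assignment_failed = True
else:
    assignment_failed = False

name, period, return_pct = fund     # unpacking into three names

print(assignment_failed)
print(name, period, return_pct)
```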

# Sets

* ## A set is an unordered collection with no duplicate elements
* ## Basic uses include membership testing and eliminating duplicate entries
* ## Sets support mathematical operations like union, intersection, difference, and symmetric difference.

In [112]:
instrument_types = ['Equity','Fixed Income','Equity','Money Market']
instrument_types_set = set(instrument_types)

In [113]:
instrument_types_set

{'Equity', 'Fixed Income', 'Money Market'}

In [114]:
another_set = {'Fixed Deposits','Equity'}

In [115]:
instrument_types_set.union(another_set)

{'Equity', 'Fixed Deposits', 'Fixed Income', 'Money Market'}

In [116]:
instrument_types_set.intersection(another_set)

{'Equity'}

In [117]:
instrument_types_set - another_set

{'Fixed Income', 'Money Market'}

In [118]:
instrument_types_set ^ another_set #items in instrument_types_set or another_set but not both

{'Fixed Deposits', 'Fixed Income', 'Money Market'}
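
The method calls above also have operator forms, and membership testing uses `in`. A sketch with illustrative names (`holdings`, `other`):

```python
holdings = {"Equity", "Fixed Income", "Money Market"}
other = {"Fixed Deposits", "Equity"}

is_member = "Equity" in holdings    # membership test
combined = holdings | other         # same result as holdings.union(other)
common = holdings & other           # same result as holdings.intersection(other)

holdings.add("Equity")              # adding an existing element changes nothing
size_after_add = len(holdings)

print(is_member, sorted(combined), common, size_after_add)
```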

# Dictionaries

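* ## Associative arrays or hash tables
* ## A collection of key: value pairs (insertion order is preserved since Python 3.7)
* ## Keys must be hashable, which in practice means an immutable type such as a string or a number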

In [119]:
my_dict = {'name':'Python','version':2.7,'objects':['List','Tuple','Set']}

In [120]:
my_dict_list = dict([('x',20),('y',40)])

In [83]:
my_dict_list

{'x': 20, 'y': 40}

In [84]:
person = dict(name='Mark',age=25,language='English')

In [85]:
person

{'age': 25, 'language': 'English', 'name': 'Mark'}

In [86]:
my_dict['name']

'Python'
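
Beyond lookup by key, a dictionary can be updated, queried with a default, and iterated. The sketch below uses a hypothetical `fund` dictionary and also shows why a list cannot serve as a key:

```python
fund = {"name": "Equity Fund", "period": "1 Year"}

fund["return_pct"] = 13.5                 # add a new key
fund["period"] = "3 Years"                # overwrite an existing key
rating = fund.get("rating", "unrated")    # default when the key is missing
pairs = [(key, value) for key, value in fund.items()]

try:
    {["a", "list"]: 1}                    # lists are mutable, hence unhashable
except TypeError:
    list_key_rejected = True
else:
    list_key_rejected = False

print(pairs)
print(rating, list_key_rejected)
```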

# Exercises

* ## Explore [**range**](https://docs.python.org/2/tutorial/controlflow.html#the-range-function) function
* ## Explore [**del**](https://docs.python.org/2/tutorial/datastructures.html#the-del-statement) statement
* ## Explore [**format**](https://docs.python.org/2/tutorial/inputoutput.html#fancier-output-formatting) function

# Exercise - 1
## There are 5 items in a buffet: Roti, mushroom curry, Salads, Palak Panner, Veg Palao
## Create a list in the order these items are arranged for the buffet
## Iterate over the list and print the item sequence number and name (hint: explore the enumerate function)

# Exercise - 2
```python
a = [1,2,4,3,5]
```
## Rearrange the order of elements to 1,2,3,4,5 using the del statement and the insert list method