# Introduction to Python for Beginners and me 

## Before starting with the real stuff, a few words on how to navigate the notebook:

* There are two main types of cells: Code and Text
* In "code" cells, "#" at the beginning of a line marks the line as a comment
* In "code" cells, every line that is not commented out is interpreted
* In "code" cells, commands preceded by % are "magics": special IPython commands that add functionality to the interactive runtime environment.
* Shift+Return is the shortcut to execute a cell
* Alt+Return is the shortcut to execute a cell and create another one below

## And remember that:
* Python is an interpreted language
* Indentation has a syntactic meaning (we'll talk about this in a few minutes)
* Indexes start from 0 (as in C)
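
A minimal sketch of the last two points, assuming Python 3 style `print()`; the list `colors` is made-up example data and not part of the original notebook.

```python
# Hypothetical example data, used only for illustration.
colors = ["red", "green", "blue"]

# Indexes start from 0, so the first element is colors[0]
# and the last one is colors[len(colors) - 1].
first = colors[0]
last = colors[len(colors) - 1]

# The indented lines form the body of each branch of the "if" statement.
if first == "red":
    message = "the first color is red"
else:
    message = "the first color is not red"

print(first, last, message)
```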

### 1. Operating with Numbers

In [None]:
## we can use python as a calculator 
6*7  #this is a comment

In [None]:
6+7

In [None]:
6-7

In [None]:
## note that this is a calculation between integers: Python 2 truncates the result to 0, while in Python 3 "/" always returns a float
6/7

In [None]:
## let's make sure that the interpreter understands we want to use floating point
6./7

In [None]:
## use print to display messages on the screen
x = 6./7
print "6./7 = " , x 
## this is a nicer way to print with a specific format
print "6./7 = %g" % x 
## note that the format matters: %d converts x to an integer, so the fractional part is lost
print "6./7 = %d" % x 

In [None]:
## let's try to be 'python3 friendly'
from __future__ import print_function

In [None]:
print("6./7 = %g" % x )
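
The effect of the format specifier can be checked by building the strings first and printing them afterwards. This is only a sketch; the variable names `as_general`, `as_fixed` and `as_integer` are introduced here for illustration.

```python
x = 6.0 / 7

as_general = "%g" % x    # general format, 6 significant digits by default
as_fixed = "%.3f" % x    # fixed-point notation with 3 decimals
as_integer = "%d" % x    # converted to an integer, the fractional part is dropped

print(as_general, as_fixed, as_integer)
```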

In [None]:
## You don't need to define the type of variable. The interpreter will guess.
a=6
b=7
print (a*b , a+b, a-b, a/b )

## As in the previous example, if one operand is a float, the interpreter automatically converts the other one to float
print()   ## prints an empty line (a bare "print" does nothing once print is a function)
a=6.   ## this is now float
b=7
print (a*b , a+b, a-b, a/b )
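
Since the meaning of `/` between two integers depends on the Python version, a hedged way to make the intent explicit is to use `//` for floor division and an explicit `float()` conversion for real division. The sketch below assumes nothing beyond the builtins.

```python
a = 6
b = 7

floor_division = a // b          # rounds down to an integer in Python 2 and Python 3
float_division = float(a) / b    # explicit conversion, same result in both versions

print(floor_division, float_division)
```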

In [None]:
## well, let's figure out how big an integer can be
## "import" loads additional modules into the interpreter environment 
import sys
## sys.maxsize is the largest native (machine-word) integer, 2**63 - 1 on most 64-bit platforms
sys.maxsize 



In [None]:
## you can probably guess how many bits an integer uses :)
2**62 + ( 2**62 -1 )

In [None]:
## there is another integer type that has "unlimited precision" (called long in Python 2; in Python 3 every int behaves this way)
sys.maxsize + 1 

In [None]:
## and this is how the interpreter represents a float
sys.float_info

## float is implemented using  "double" in C  

In [None]:
## this raises a NameError: sqrt is not available until we import it from a module
sqrt(4)

In [None]:
## the math module provides the basic math functions of the C standard library
import math 
math.sqrt(4)

In [None]:
## we can import specific symbols from a module into the current namespace and use them directly
print (math.exp(2.))

from math import exp
print (exp(2.))

In [None]:
## in case you need complex numbers 
z=3.5+4j

print ("Re(z) = %g" % z.real)
print ("Im(z) = %g" % z.imag)
## when we want to fit more than one value in a formatted string 
print ("Re(z) = %g ; Im(z) = %g " % ( z.real , z.imag ))


In [None]:
## to do standard math operations on complex numbers you have to use the module cmath
import cmath
cmath.sqrt(-4)
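
To see why `cmath` is needed, the sketch below compares the two modules on the same negative input: `math.sqrt` only works with real numbers and raises a `ValueError`, while `cmath.sqrt` returns a complex result. The names `real_root` and `complex_root` are introduced here for illustration.

```python
import math
import cmath

# math.sqrt only handles real numbers, so a negative argument is an error.
try:
    real_root = math.sqrt(-4)
except ValueError:
    real_root = None

# cmath.sqrt always returns a complex number.
complex_root = cmath.sqrt(-4)

print(real_root, complex_root)
```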

In [None]:
## Some basic comparison operators 
a = 2 
b = 3 
print ("a = " ,a )
print ("b = " ,b)

## == stands for "is equal to"  
## be careful and do not confuse 
## == , which is an operator that compares the two operands,
## with = , which is the assignment operator.
print ("a == b is " , a == b )

## != "not equal to" 
print ("a != b is " , a != b )

## greater and smaller than
print ("a < b is " , a < b )
print ("a > b is " , a > b )

## the two boolean values
print ("True is ... well ... " , True)
print  ("...and obviously False is " , False )

## That's enough numbers for now. Let's have fun with strings

In [None]:
## a string is just a sequence of characters within quotes "" or '' 
mystr = "My name is Francesco"
print (mystr)

## it does not matter if you use '' or "" , as long as the opening and closing quotes match
mystr = 'My name is Francesco'
print (mystr)

In [None]:
## Strings are easy to manipulate in Python 
## for example, we can measure their length 
mystr = 'my string'
len(mystr)

In [None]:
## or we can extract substrings
mystr = 'my string'

print (mystr[3])

print (mystr[-1])

print (mystr[0:9])

print (mystr[3:])

print (mystr[3:-1])
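
The slices above can be read as "from the start index up to, but not including, the end index". A small sketch that stores each result in a variable (the variable names are introduced here for illustration):

```python
mystr = 'my string'
# index:  012345678

char_at_3 = mystr[3]              # single character at index 3
last_char = mystr[-1]             # negative indexes count from the end
whole = mystr[0:9]                # indexes 0..8, i.e. the whole string
from_3 = mystr[3:]                # from index 3 to the end
from_3_but_last = mystr[3:-1]     # from index 3 up to, but excluding, the last character

print(char_at_3, last_char, whole, from_3, from_3_but_last)
```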


In [None]:
## dir() gives information on the attributes of an object 
dir(mystr)

In [None]:
## to learn more about something we can use help()
help(mystr.replace)

In [None]:
## or the fancy IPython version "?"
mystr.replace?

In [None]:
## manipulating strings is very easy 
mystr = 'My name is Francesco'

## finding the location of a substring
print (mystr.find("name"))

## changing to uppercase
print (mystr.upper())

## replacing substrings
print (mystr.replace('name','brother'))

## these operations do not modify the original string
print (mystr)

In [None]:
## we can count the occurrences of a letter
print (mystr.count('a'))

## note that the string comparison is case sensitive
print ("the letter f occurs %d times " % mystr.count('f') )
print ("the letter F occurs %d times " % mystr.count('F') )



In [None]:
## to count a letter regardless of case, convert the string to lowercase first

mystr.lower().count('f')

In [None]:
## string addition and multiplication
print ("Hello, m" + mystr[1:])

print ("Ciao " *2 + ", m" + mystr[1:])

In [None]:
## "in" returns a boolean
print ("Fra" in mystr)
print ("Tim" in mystr)

In [None]:
## .split() separates fields 
print (mystr.split())

print (mystr.split("a"))
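
As a complement to `.split()`, the string method `.join()` goes in the opposite direction and glues a list of strings back together. It does not appear in the original notebook; it is shown here only as a sketch of the round trip.

```python
mystr = 'My name is Francesco'

words = mystr.split()          # split on whitespace
pieces = mystr.split("a")      # split on every occurrence of the letter "a"

# join() inserts the separator between the elements of the list
dashed = "-".join(words)
rebuilt = "a".join(pieces)     # undoes the split on "a"

print(words, pieces, dashed, rebuilt)
```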

## Lists : ordered collections of stuff

In [None]:
## a list is an ordered collection of objects
list1 = [ "Francesco" , "Italy" , True , 6 ]

print (len(list1))
print (list1)  

In [None]:
list2 = []
list2.append("Marc")
list2.append("Germany")
list2.append(False)
list2.append(3)
print (list2)

In [None]:
print (list1)
print (list1[1]) 
print (list1[-1]) 
print (list1 [0:2]) 

In [None]:
print (list1.index('Italy'))

In [None]:
## sort() sorts lists in place (sorting mixed types like these works in Python 2 only; Python 3 raises a TypeError)
list1.sort()
print (list1)
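
"In place" means that `sort()` modifies the list itself and returns `None`. When the original order must be kept, the builtin `sorted()` returns a new list instead. A minimal sketch with made-up data (the list `cities` is hypothetical):

```python
cities = ["Rome", "Berlin", "Vienna"]   # hypothetical example data

sorted_copy = sorted(cities)    # new sorted list, "cities" is left untouched
order_before = list(cities)     # copy of the original order, for comparison

return_value = cities.sort()    # sorts "cities" itself and returns None

print(order_before, sorted_copy, cities, return_value)
```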

In [None]:
list1.extend(list2)
print (list1)

In [None]:
## "in" returns a boolean
4 in list2

## Program control flow: where indentation matters!!

## if statements:

In [None]:
I_lived_in = [ "Italy" , "United States" ]
Marc_lived_in = [ "Germany" , "Austria" , "United States" ]
#Marc_lived_in = [ "Germany" , "Austria"]

print ("I lived in %d places and Marc in %d " % (len(I_lived_in) , len(Marc_lived_in) ))

## IMPORTANT : the indentation is used to identify code blocks.
if len(I_lived_in) > len(Marc_lived_in) :
    res = "I lived in lots of places"
    print (res)
elif len(I_lived_in) == len(Marc_lived_in) :
    print ("nothing")
else :
    print ("marc is cooler")


## for loops:

In [None]:
## range(i,j) returns the integers from i to j-1 (as a list in Python 2, as a lazy range object in Python 3)
range(1,11)

In [None]:
## sum of first 10 integers

result=0
for i in range(1,11):
    result+=i
    
print (result)
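
The loop result can be cross-checked against the builtin `sum()` and the closed formula n(n+1)/2. This is only a sketch; `list(range(...))` is used so that the numbers are visible in Python 3 as well.

```python
numbers = list(range(1, 11))   # the integers 1, 2, ..., 10

# explicit loop, as in the cell above
result = 0
for i in numbers:
    result += i

# two independent ways to get the same value
n = 10
builtin_total = sum(numbers)
formula_total = n * (n + 1) // 2

print(numbers, result, builtin_total, formula_total)
```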

In [None]:
shopping_list=["bananas", "chocolate", "carrots"]

In [None]:
## lists are iterable. 
## We can loop directly over the list elements without using indexes
for thing in shopping_list:
    print ("today I purchased some" , thing)
             

In [None]:
a="ciao"
b="Jes"
print (a+b)
print (a , b )

## Other useful builtins: tuples, enumerate and zip:


### A 'tuple' is an "immutable list". Once created, it cannot be changed.

In [None]:
##         Name    Major    Nationality  Glasses?  
person = 'John', 'Biology', 'American' , False     # tuple packing 

print (person)

## you can use parentheses to enclose the tuple, but that's not necessary.
person = ('John', 'Biology', 'American' , False )    # tuple packing 

print (person)

In [None]:
print (person[0:3],person[3])

print ( "name = %s ; major = %s ; nationality = %s " % person[0:3] )

In [None]:
## the values in a tuple can be 'unpacked' into variables
name, subject , nationality  = person[:3]   # tuple unpacking

print ("name = " , name , " ; nationality = ", nationality )



In [None]:
## single elements of a tuple cannot be changed: the tuple is immutable, so this raises a TypeError
person[3]=True


In [None]:
## We can however reassign the name to a new tuple.
print (person)
person = person[0:3] + (True,)
print (person)
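
The two behaviours can be put side by side in one cell by catching the error. A minimal sketch; the names `error_message` and `updated` are introduced here for illustration.

```python
person = ('John', 'Biology', 'American', False)

# Item assignment is not supported by tuples and raises a TypeError.
try:
    person[3] = True
    error_message = None
except TypeError as err:
    error_message = str(err)

# Building a new tuple from the old one is allowed.
updated = person[:3] + (True,)

print(error_message)
print(person, updated)
```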

### To iterate over the elements of a list together with their indexes, you can use the builtin 'enumerate' 

In [None]:
## without enumerate, we can loop over the elements
## and look up the index of each one with list.index()

people = ['John','Fra','Hanna','Camelia']

for a in people :
    print ( a  )

    
for a in people :
    print ( 'Name ', people.index(a)  , ' is ' ,a )


In [None]:
for i,a in enumerate(people) :
    print ( 'Name ' , i , ' is ' , a )

In [None]:
a=enumerate(people)

In [None]:
## next() works in Python 2.6+ and in Python 3 (a.next() is Python 2 only)
print (next(a))
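
`enumerate` produces its (index, element) pairs one at a time, which is why it can be advanced step by step with the builtin `next()`. The sketch below also uses the optional second argument of `enumerate`, which sets the starting value of the counter.

```python
people = ['John', 'Fra', 'Hanna', 'Camelia']

pairs = enumerate(people)
first = next(pairs)         # consumes the first pair
second = next(pairs)        # consumes the second pair
remaining = list(pairs)     # whatever has not been consumed yet

# start counting from 1 instead of 0
numbered = list(enumerate(people, 1))

print(first, second, remaining)
print(numbered)
```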

### The builtin 'zip' can be used to combine the elements of multiple lists into a list of tuples.

In [None]:
people     = ['John'   ,'Fra'    ,'Hanna'    ,'Camelia']
major      = ['Biology','Physics','Chemistry','Biochemistry']
has_glasses= [  True   , True    , False     , False]

zip(people,major,has_glasses)

In [None]:
## zip is very convenient to 'connect' lists for example inside for loops.
for a,b,c in zip(people,major,has_glasses):
    print (a + " studies " + b + " and " + ("wears " if c else "doesn't wear ") + "glasses")
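
Two details of `zip` are worth a quick sketch: in Python 3 it returns a lazy object, so `list()` is needed to see the tuples, and it stops at the end of the shortest input. The shortened list `short_major` is made up for this illustration.

```python
people = ['John', 'Fra', 'Hanna', 'Camelia']
major = ['Biology', 'Physics', 'Chemistry', 'Biochemistry']

# list() forces the evaluation, which is needed in Python 3
combined = list(zip(people, major))

# hypothetical shorter list: zip stops after the last complete tuple
short_major = ['Biology', 'Physics']
truncated = list(zip(people, short_major))

print(combined)
print(truncated)
```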

## How to define your own functions


In [None]:
## let's introduce a simple function 

def myfunction(x):
    return x*5

## let's test our function
y=myfunction(3.5)
print (y)



In [None]:
## note that the argument of the function can be anything that supports the operation x*5
y=myfunction("Hello ")
print (y)
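
The same function therefore behaves differently depending on the type of its argument, and fails only when the multiplication itself is not defined. A short sketch, with `None` chosen here as an example of an unsupported argument:

```python
def myfunction(x):
    return x * 5

number_result = myfunction(3.5)     # numeric multiplication
string_result = myfunction("ab")    # string repetition
list_result = myfunction([1, 2])    # list repetition

# None does not support "*", so the call raises a TypeError
try:
    myfunction(None)
    none_supported = True
except TypeError:
    none_supported = False

print(number_result, string_result, list_result, none_supported)
```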


## Simple file operations:
We will illustrate simple file operations with the following exercise:

* The files text1.txt and text2.txt contain the full text of the novel "Le Petit Prince" translated into two languages.
* First open the files and read in the text.
* Then compute the frequency of letters in the text.
* Write an output file "Letter_frequency.txt" containing the letter frequencies.
* Generate a bar plot of the frequencies and compare them with the reference frequencies for the English language that you can find on Wikipedia ( https://en.wikipedia.org/wiki/Letter_frequency ). Can we recognize which of the two files contains the text in English?
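
Before touching any file, the counting step of the exercise can be sketched on a short in-memory string. This version uses `collections.Counter` from the standard library, which is a different technique from the list-based loop used later in the notebook; `sample_text` is a made-up stand-in for the file contents.

```python
from collections import Counter
import string

sample_text = "The little prince"   # hypothetical stand-in for the text read from a file

# keep only the letters a-z, ignoring case, spaces and punctuation
letters = [c for c in sample_text.lower() if c in string.ascii_lowercase]
counts = Counter(letters)
total = len(letters)

# relative frequency of every letter of the alphabet (0.0 if the letter is absent)
frequencies = {}
for letter in string.ascii_lowercase:
    frequencies[letter] = counts[letter] / float(total)

print(frequencies['e'], frequencies['t'], frequencies['z'])
```

Dividing by `float(total)` keeps the result a real number in Python 2 as well.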

In [None]:
## you can use some basic Linux commands, as long as they are the only thing in the cell:

In [None]:
## Make sure that the present working directory is the working folder where you downloaded the example files.

In [None]:
pwd

In [None]:
ls

In [None]:
## if you are not in the folder that contains the downloaded files, move there:

In [None]:
##cd 'c:\\Users\\your_user\\correct_folder\'

In [None]:
cat simple_text.txt

In [None]:
## we open the file (open() works in Python 2 and 3; file() exists in Python 2 only)
filename='simple_text.txt'
try:
    myfile = open(filename)
except IOError:                     
    print ("Cannot open file %s " % filename)
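
A common alternative is to open files inside a `with` block, which closes the file automatically when the block ends. The sketch below is self-contained: it first creates a small throwaway file in a temporary folder, so the file names `example.txt` and `does_not_exist.txt` are hypothetical and unrelated to the notebook's data files.

```python
import os
import tempfile

# create a small throwaway file so that the example does not depend on any download
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "example.txt")
with open(path, "w") as out:
    out.write("first line\nsecond line\n")

# the file is closed automatically at the end of the "with" block
with open(path) as infile:
    first_line = infile.readline()   # read one line
    infile.seek(0)                   # "rewind" to the beginning
    whole_text = infile.read()       # read everything

# opening a file that does not exist raises IOError (an alias of OSError in Python 3)
try:
    open(os.path.join(tmp_dir, "does_not_exist.txt"))
    opened = True
except IOError:
    opened = False

print(repr(first_line), repr(whole_text), opened)
```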

In [None]:
## let's read a line 
a=myfile.readline()
print (a)

In [None]:
## "rewind" the file
myfile.seek(0)

In [None]:
help(myfile.read)

In [None]:
## we can read in the whole file with 'read'
mytext=myfile.read()

In [None]:
mytext[:60]


In [None]:
## using the replace method we can replace newlines and carriage returns with spaces
mytext=mytext.replace('\n', ' ').replace('\r', ' ')

In [None]:
print(mytext)

In [None]:
## define the filename we would like to process
filename='text1.txt'

## try to open the file
try:
    myfile = open(filename)
except IOError:                     
    print ("Cannot open file %s " % filename)

## read in the text and replace the newline characters with spaces
mytext=myfile.read().replace('\n', ' ').replace('\r', ' ')
    
alphabet_letters = ["a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z"]

## start with an empty list
frequencies = []

## create the list of frequencies counting the occurrence of each letter
for i in alphabet_letters :    
    frequencies.append(mytext.lower().count(i))

## normalize     
frequencies = ([ x / (sum(frequencies)*1.) for x in frequencies  ])

print(frequencies)


In [None]:
#del mytext
#del myfile

In [None]:
## we only need this for 2 files, but to make the code reusable it makes sense to define a function that 
## takes the filename as input and returns the frequencies.

def count_letters(myfilename):
    try:
        myfile = open(myfilename)
    except IOError:                     
        print ("Cannot open file %s " % myfilename)
        return []
        
    alphabet_letters = ["a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z"]

    frequencies = []
    
    mytext=myfile.read().replace('\n', ' ').replace('\r', ' ')

    for i in alphabet_letters :    
        frequencies.append(mytext.lower().count(i))

    fr = ([ x / (sum(frequencies)*1.) for x in frequencies  ])
    
    return fr


In [None]:
## we can now call the function just defined with either filename as input
lett_freq = count_letters('text2.txt')

print (lett_freq ,sum(lett_freq))

In [None]:
alphabet_letters = ["a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z"]

first_language=count_letters('text1.txt')
second_language=count_letters('text2.txt')

print(first_language)
print(second_language)

In [None]:
#print  (zip(alphabet_letters,first_language,second_language))

In [None]:
## we can open a file and write the results 
myoutfile = open('Letter_frequency.txt', 'w')

for a in zip(alphabet_letters,first_language,second_language):
    myoutfile.write('%s\t%lf\t%lf\n'% a)
    
myoutfile.close()
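
The tab-separated format written above is easy to read back: each line is split on the tab character. A self-contained sketch with made-up numbers; the file name `table.txt` and the values in `freq_1` and `freq_2` are hypothetical.

```python
import os
import tempfile

letters = ["a", "b", "c"]
freq_1 = [0.5, 0.3, 0.2]   # made-up frequencies, for illustration only
freq_2 = [0.4, 0.4, 0.2]

path = os.path.join(tempfile.mkdtemp(), "table.txt")

# write one tab-separated line per letter
with open(path, "w") as out:
    for row in zip(letters, freq_1, freq_2):
        out.write("%s\t%f\t%f\n" % row)

# read the file back and split every line into its three fields
with open(path) as infile:
    rows = [line.rstrip("\n").split("\t") for line in infile]

print(rows)
```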

In [None]:
cat Letter_frequency.txt

In [None]:
ls

In [None]:
%matplotlib inline

import matplotlib
import numpy as np
import matplotlib.pyplot as plt

In [None]:
plt.rcParams['figure.figsize'] = 12,12

In [None]:
## We will now plot using some of the plotting functionality provided by matplotlib 
fig, axes = plt.subplots(nrows=2)
names = ['Language 1', 'Language 2']
x = np.arange(len(first_language))

## create a bar plot for each language
for ax, freq, name in zip(axes, [first_language,second_language], names):
    ax.bar(x, freq )
    ax.set(xticks=x+0.5, title=name)
    ax.set_xlim([0,26])
    ax.set_xticklabels(alphabet_letters)

### Which one is English?  https://en.wikipedia.org/wiki/Letter_frequency

### Dictionaries: a supercool data structure in Python


In [None]:
## people       'John'   ,'Fra'    ,'Hanna'    ,'Camelia'
## major      'Biology','Physics','Chemistry','Biochemistry'

## a dictionary is made of "key" : "value" pairs
my_dictionary = {'John':'Biology' , 'Fra':'Physics' ,'Hanna': 'Chemistry' ,'Camelia': 'Biochemistry'}

In [None]:
print (my_dictionary)

In [None]:
print (  "the keys are " , my_dictionary.keys())
print ( "the values are " , my_dictionary.values() )

In [None]:
## you can extract the "value" addressing the corresponding "key"
print (my_dictionary['Camelia'])

In [None]:
## another way for building a dictionary is to start from an empty one and fill in the entries.

my_dictionary = {}
my_dictionary['Hanna'] = 'Chemistry'
my_dictionary['John'] = 'Biology'

print (my_dictionary)

In [None]:
print (my_dictionary['John'])

In [None]:
## a dictionary can be created from a list of tuples
## in this simple case a list with just one tuple
mylist_tuple = [('Fra', 'Physics')]
print (mylist_tuple)

## use the builtin 'dict' to convert that list into a dictionary.
mydict = dict(mylist_tuple)
print(mydict['Fra'])
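
Looking up a key that is not in the dictionary raises a `KeyError`. Two hedged ways around it are the `in` operator and the `.get()` method with a default value, sketched below on a small dictionary; `majors` and the fallback string `'unknown'` are chosen here for illustration.

```python
majors = {'John': 'Biology', 'Fra': 'Physics'}

has_hanna = 'Hanna' in majors                # membership test on the keys
fallback = majors.get('Hanna', 'unknown')    # default value instead of a KeyError
found = majors.get('Fra', 'unknown')

# items() gives the (key, value) pairs; sorted() makes the order predictable
pairs = sorted(majors.items())

print(has_hanna, fallback, found, pairs)
```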

### How can we use a dictionary to make our life simpler when looking up the results in the "language" example?

In [None]:
letters_freq = dict(zip(alphabet_letters, first_language))

In [None]:
letters_freq['a']

In [None]:
letters_freq = dict(zip(alphabet_letters, [[x, y] for x, y in zip(first_language, second_language)]))

In [None]:
letters_freq['a']
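
With the frequencies stored per letter, questions such as "which letter is the most frequent in each language?" become one-liners. The sketch below uses a three-letter alphabet and made-up frequencies so that it runs on its own; with the real data the same lines apply unchanged.

```python
# made-up data, standing in for the real frequency lists
alphabet_letters = ["a", "b", "c"]
first_language = [0.5, 0.3, 0.2]
second_language = [0.2, 0.3, 0.5]

# letter -> (frequency in language 1, frequency in language 2)
letters_freq = {}
for letter, f1, f2 in zip(alphabet_letters, first_language, second_language):
    letters_freq[letter] = (f1, f2)

# the key whose stored frequency is the largest, for each language
most_common_first = max(letters_freq, key=lambda letter: letters_freq[letter][0])
most_common_second = max(letters_freq, key=lambda letter: letters_freq[letter][1])

print(letters_freq['a'], most_common_first, most_common_second)
```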

In [None]:
pwd

In [None]:
%pprint

In [None]:
%reset