# __Why Python as a tool?__

It is important to differentiate the Python language from its implementations. A language implementation is a system for executing programs, and Python uses the _interpretation approach_: an interpreter reads the program as input and performs the actions written in it. Here you will learn the basic concepts of Python as a programming language. Let's begin.

### Content
- Lexical structure
    - Lines and Indentation
    - Tokens
    - Statements
- Data Types
    - Numbers
    - Strings (str)
    - Tuples
    - Lists 
    - Sets
    - Dictionaries 
- Variables and other references
- Control Flow Statements
    - The _if_ Statement
    - The _while_ statement
    - The _for_ Statement
    - The _break_ and _continue_ statements
- Functions
    - Defining Functions: The _def_ statement
    - Parameters
    - Attributes of Function objects
    - Function Annotations
    - The return Statement
    - Lambda Expressions
- Python Ecosystem

Python has several features that make it well suited for learning (and doing) data science:
- It's open source, which means it is free.
- It works similarly in most applications, so it's a predictable language, which makes it desirable in programming.
- It's relatively simple to write and to maintain (and, in particular, to understand). Due to its pragmatism, it is widely accepted by programmers.
- It's useful in many applications. It was designed as a general-purpose programming language, so it can cooperate with a variety of other software components, making it the right language for gluing together code written in other languages.
- It's simple and fast to process (both for humans and for tools). It is a very high-level language (VHLL) that affords high programmer productivity, making Python a strong development tool.
- It's an object-oriented programming language that also supports functional and procedural programming styles.
- It has lots of useful data science–related libraries.

At the time of this writing, __Classic Python__ (also known as CPython, and often called just Python) is the most up-to-date, solid, and complete production-quality implementation of the language. CPython is a bytecode compiler, interpreter, and set of built-in and optional modules, all coded in standard C.

Another interesting tool built on top of CPython is IPython, which enhances CPython's interactive interpreter to make it more powerful and convenient. IPython's notebook interface has since evolved into __*Jupyter Notebooks*__, an interactive programming environment that lets you mix snippets of code with commentary in literate programming style and shows the output of executing the code.

There is more to Python programming than just the language. There are plenty of libraries and extensions that suit almost any application. Most of the modules are fully functional in different versions of Python, and let code access functionality supplied by the underlying operating system or other software components such as graphical user interfaces (GUIs), databases, and networks. Extensions also afford great speed in computationally intensive tasks such as XML parsing and numeric array computations, which makes them especially suitable for data science.

You do not need to be as proficient in Python as a fluent software developer to process data successfully. To be a high-performing data scientist or data analyst, it's recommended to get a solid foundation in the basics: the lexical structure, data types, variables, control flow statements, and functions.

The distribution of Python I recommend the most is Anaconda. This distribution bundles the Python standard library, some external extensions, and IPython, so you are ready to tackle data analysis tasks.

When doing data science with Python, your code is expected to be written in a _Pythonic way_, meaning it should be concise and efficient. Pythonic code is often associated with the use of list comprehensions, which are ways to implement useful data processing functionality with a single line of code.
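
As a minimal sketch of what a list comprehension looks like, the following compares a regular loop with the equivalent one-line comprehension. The list `numbers` and the variable names are made-up sample data, not part of any library.

```python
# Made-up sample data, used only for illustration.
numbers = [1, 2, 3, 4, 5, 6]

# Loop version: build the result step by step.
even_squares_loop = []
for n in numbers:
    if n % 2 == 0:
        even_squares_loop.append(n ** 2)

# List comprehension: the same filter and transformation in one expression.
even_squares = [n ** 2 for n in numbers if n % 2 == 0]

print(even_squares)                        # [4, 16, 36]
print(even_squares == even_squares_loop)   # True
```

Both versions produce the same list; the comprehension is usually preferred when the logic fits comfortably on one line.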


## __Lexical structure__

The lexical structure of a programming language is the set of basic rules that govern how you write programs in that language. It is the lowest-level syntax of the language, specifying such things as what variable names look like and how to denote comments. 

Each Python source file is a text file, which can be viewed as a sequence of lines, tokens, or statements.

### Lines and Indentation

A Python program is a sequence of *logical lines*, each made up of one or more physical lines. A physical line may end with a comment, indicated by a hash sign ( `#` ) placed anywhere that is not inside a string literal. All characters after the `#`, up to but excluding the line end, are the comment: Python ignores them.


In [17]:
# This is a single-line comment 
# There are no double-line \ 
# comments 

Python does not use delimiters such as the semicolon ( ; ) to mark the end of a statement; the end of the line ends most statements. However, a logical line can span two or more physical lines. To join them, either end a physical line with a backslash ( `\` ), or leave an open parenthesis ( `(` ), bracket ( `[` ), or brace ( `{` ) unclosed, in which case Python keeps reading until the matching closing character. Physical lines after the first one are called *continuation lines*. Triple-quoted string literals can also span physical lines; they are mostly used for SQL queries and other long text.

In [18]:
# This is the simplest logical line: a single physical line. In this case, an assignment.
variable = 5

# This is a logical line made of two physical lines.
# In this case, it assigns a list to a variable.
variable = [1, 2,\
            3, 4] # This is the continuation line.
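
To make the two ways of joining physical lines concrete, here is a small sketch. The names `values`, `total`, and `same_total` are illustrative only.

```python
# Implicit continuation: inside ( ), [ ] or { } no backslash is needed.
values = [1, 2,
          3, 4]

# Explicit continuation: the backslash must be the last character on the line.
total = 1 + 2 + \
        3 + 4

# Parentheses are usually preferred over backslashes for long expressions.
same_total = (1 + 2
              + 3 + 4)

print(values, total, same_total)   # [1, 2, 3, 4] 10 10
```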

Python uses _indentation_ to express the block structure of a program. Blocks of code are denoted by indentation rather than by braces or other begin/end delimiters. A __block of statements__ is a contiguous sequence of logical lines, all indented by the same amount, and a logical line with less indentation ends the block. All the statements in a block must have the same indentation, as must all the clauses in a compound statement.

In [19]:
# This is the first group of statements. It is used to declare and initialize two variables
num1 = 6
num2 = 9

# This is the second group of statements. This one prints the output of the sum
total = num1 + num2        # named total to avoid hiding the built-in function sum()
print('This is the output:', total)

# In this case, all statements have the same indentation, so they belong to the same block.

This is the output: 15


Python treats each tab as if it were up to 8 spaces; nevertheless, the standard Python style is to use four spaces per indentation level. __You must be careful because Python does not allow mixing tabs and spaces inconsistently for indentation.__
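
The following sketch shows how indentation alone decides which block each statement belongs to. The sample values and the names `numbers`, `labels`, and `label` are made up for illustration.

```python
numbers = [3, 8, 5]   # made-up sample values
labels = []

for n in numbers:                 # header of the for statement
    if n % 2 == 0:                # one level in: part of the for body
        label = 'even'            # two levels in: part of the if body
    else:
        label = 'odd'             # two levels in: part of the else body
    labels.append(label)          # back to one level: still inside the loop
print(labels)                     # no indentation: runs once, after the loop
```

Moving the last `print` call one level to the right would make it part of the loop body, so it would run three times instead of once.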

### Tokens

These are the _elementary lexical components_ of a logical line. Each token corresponds to a substring of the logical line. Whitespace may be used freely between tokens, and it is required between adjacent identifiers or keywords; without it, Python would parse them as a single longer identifier. The normal token types are _identifiers, keywords, operators, delimiters, and literals_.

__Identifiers__

These are names used to identify variables, functions, classes, modules, or other objects. They always start with a letter or an underscore ( `_` ), followed by zero or more letters, underscores, and digits. Case is significant: lowercase and uppercase are distinct. Punctuation characters such as `@` , `$` , and `!` are not allowed.

Normal Python style is to start class names with an uppercase letter, and most other identifiers with lowercase letters. Starting an identifier with a single leading underscore indicates by convention that the identifier is meant to be private. Starting an identifier with a double underscore indicates a strongly private identifier; if the identifier also ends with two trailing underscores, however, this means that it's a language-defined special name.

In [20]:
first_variable = 200    #Variables are named with lowercase letters.
                        # Compound identifiers are chained with an underscore.
_private_variable = 100 #This is a private variable (by convention only)

print("This is a regular variable: ",first_variable,", This is a private variable: ", _private_variable)

This is a regular variable:  200 , This is a private variable:  100


__Delimiters__

Python uses the following characters and character combinations as delimiters in various statements, expressions, and list, dictionary, and set literals and comprehensions, among other purposes. The characters `'` and `"` surround string literals.

    (   )   [   ]   {   }
    ,   :   .   =   ;   @
    +=  -=  *=  /=  //= %=
    &=  |=  ^=  >>= <<= **=

__Keywords__

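Python reserves 35 identifiers for syntactic uses, which is why they are sometimes known as *reserved words*. The exact number varies slightly between versions: the Python 3.9 output below also lists `__peg_parser__`, an internal entry that later versions removed. Like any other identifiers, keywords are case sensitive. You can list them all by importing the keyword module and printing its list as follows: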

In [21]:
import keyword 
print(keyword.kwlist)

['False', 'None', 'True', '__peg_parser__', 'and', 'as', 'assert', 'async', 'await', 'break', 'class', 'continue', 'def', 'del', 'elif', 'else', 'except', 'finally', 'for', 'from', 'global', 'if', 'import', 'in', 'is', 'lambda', 'nonlocal', 'not', 'or', 'pass', 'raise', 'return', 'try', 'while', 'with', 'yield']


__Literals__

These are direct denotations in a program of a data value (a number, string, or container). The following are number and string literals in Python: 


In [22]:
35             #Integer literal
3.14           #Float literal
1.0j           #Imaginary literal
'Hello'        #String literal 
"world"        #Another string literal 
"""Good
night"""       #Triple-quoted string literal, spanning two lines

'Good\nnight'

Combining numbers and string literals with the appropriate delimiters, you can directly build many container types with those literals as values:


In [23]:
[25,4.52,'Hello']       #This is a list
[]                      #This is an empty list
200,305,567             #This is a tuple
(200,305,567)           #This is another tuple
()                      #This is an empty tuple
{'a':5, 'b':67}         #This is a dictionary
{}                      #This is an empty dictionary
{2, 5, 6, 7, 'letra'}   #This is a set
#There is no literal to denote an empty set; use set() instead.

{2, 5, 6, 7, 'letra'}

__Operators__

Python uses nonalphanumeric characters and character combinations as operators. They generally act within an expression, which is a *phrase* of code that Python evaluates to produce a value. The simplest expressions are literals and identifiers, and you can build other expressions by joining subexpressions with operators and/or delimiters. The simplest operators are those that represent basic math operations:


In [24]:
#These are variables
a=15
b=3       

# Then we use operators as follows

# Sum:
c = a + b
# Subtraction:
d = a - b
# Product 
e = a * b
# Division
f = a / b

print('The result of the sum is: ', c)
print('The result of the subtraction is: ', d)
print('The result of the product is: ', e)
print('The result of the division is: ', f) # The result of a division is always a float

The result of the sum is:  18
The result of the subtraction is:  12
The result of the product is:  45
The result of the division is:  5.0
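
Beyond the four basic operations, a few other operators come up constantly in data work. The sketch below uses made-up values for `a` and `b`; the variable names are illustrative only.

```python
a = 15
b = 4

floor_quotient = a // b      # floor division: 3
remainder = a % b            # remainder (modulo): 3
power = a ** 2               # exponentiation: 225
is_greater = a > b           # comparison: True
both = a > b and b > 10      # logical and: False

print(floor_quotient, remainder, power, is_greater, both)
# 3 3 225 True False
```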


### Statements 

There are two types of statements: simple and compound statements.

__Simple Statements__

A simple statement is one that contains no other statements. It lies entirely within a logical line. In Python, you may place more than one simple statement in a logical line, with a semicolon (` ; `) as a separator. However, it is recommended to use one simple statement per line to increase readability. Any expression can be on its own as a simple statement. 

An *assignment* is a simple statement that assigns values to variables using the ` = ` operator. A plain assignment can never be part of an expression; when you need to bind a name inside an expression, use the ` := ` (walrus) operator instead.


In [25]:
# This is a simple statement
variable = 345.76j
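
The difference between a plain assignment and the walrus operator can be sketched as follows, assuming Python 3.8 or later. The list `data` and the names `n` and `m` are made up for illustration.

```python
data = [4, 8, 15, 16, 23, 42]   # made-up sample values

# A plain assignment is a statement, so it needs its own line.
n = len(data)
if n > 5:
    print('long list:', n)

# The := operator binds a name inside an expression (Python 3.8 or later).
if (m := len(data)) > 5:
    print('long list:', m)
```

Both `if` statements print `long list: 6`; the second one simply saves a line.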

__Compound Statements__

A compound statement contains one or more other statements and controls their execution. It has one or more *clauses*, aligned at the same indentation. Each clause has a *header* starting with a keyword and ending with a colon ( : ), followed by a *body*, which is a sequence of one or more statements. The body, also known as a block, usually sits on separate logical lines after the header line, indented four spaces rightward by convention.

In [26]:
# This is a compound statement 
for i in range(10):
    print("This is the ", i, "iteration of the loop")

This is the  0 iteration of the loop
This is the  1 iteration of the loop
This is the  2 iteration of the loop
This is the  3 iteration of the loop
This is the  4 iteration of the loop
This is the  5 iteration of the loop
This is the  6 iteration of the loop
This is the  7 iteration of the loop
This is the  8 iteration of the loop
This is the  9 iteration of the loop


## __Data Types__

In Python, any data value is considered an _object_. Every object has a type, the category under which its data value is classified. Types can be built into Python or user-defined; the latter are also called _classes_.

The main built-in types covered here are *numbers, strings, lists, tuples, dictionaries, and sets*. A type can be mutable or immutable. An immutable object cannot be altered after it is created; therefore, an operation on an immutable object either produces a new object or leaves the original unchanged.

### Numbers

Numeric types in Python include integers, floating-point numbers, and complex numbers. These are all immutable objects, which means that operations on them produce a new number object. It's important to know that numeric literals do not include a sign; a leading ( + ) or ( - ), if present, is a separate operator.

- __Integers:__ Integer literals in Python can be decimal or non-decimal. A decimal literal is a sequence of digits whose first digit is nonzero. The decimal literal is the most commonly used integer literal in data science; however, there are specific applications where _binary, octal, or hexadecimal_ literals are needed.

- __Floating-point numbers:__ these literals are a sequence of decimal digits that includes a decimal point ( . ), an exponent suffix ( e or E ), or both.

- __Complex numbers:__ these are made of two floating-point numbers, one each for the real and imaginary parts. The imaginary part is identified by the suffix ( j ) added to it, which stands for the square root of -1.

In [27]:
1 , 4 , 456, 4325678        #Are all integers
3.2, 345.59, 24567.456789   #Are all floating-point numbers
1-3j, 3.2+943j              #Are all complex numbers

((1-3j), (3.2+943j))
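
The less common literal forms mentioned above can be sketched like this. All variable names are illustrative, and the underscore grouping assumes Python 3.6 or later.

```python
binary = 0b1010        # binary literal for 10
octal = 0o17           # octal literal for 15
hexadecimal = 0xFF     # hexadecimal literal for 255
grouped = 1_000_000    # underscores may group digits (Python 3.6 or later)
scientific = 2.5e3     # float with an exponent suffix: 2500.0
z = 3 + 4j             # complex number built from a real and an imaginary part

print(binary, octal, hexadecimal, grouped, scientific)   # 10 15 255 1000000 2500.0
print(z.real, z.imag, abs(z))                            # 3.0 4.0 5.0

# The sign is an operator, not part of the literal: ** binds tighter than -.
print(-2 ** 2)                                           # -4
```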

An *iterable* is any Python object that can yield its items one at a time. This concept captures, in abstract, the iteration behavior of __sequences__, which are ordered containers of items, indexed by integers. The built-in sequence types covered here are strings, lists, and tuples.

### Strings (str)

The *str* object is a sequence of characters used to store and represent text-based information. These objects are immutable: when an operation is performed on a str object, the result is always a new string object, rather than mutating an existing string. 

A variant of string literals are *raw string literals*, where escape sequences are not interpreted. These are quoted literals immediately preceded by an ( r or R ). Raw string literals come in handy for strings that include many backslashes, especially regular expression patterns and Windows absolute filenames (which use backslashes as directory separators).

In [28]:
'This is a literal string'
"This is another literal string"

'This is a string that \
spans two lines '                       #Comments not allowed on the previous line. 

"""
Using triple quoted lines we can:
use "quoted expressions inside a string"
and also span several lines """         #Comments not allowed on previous lines.

r"C:\Users\main_user\directory_name"             #This is the recommended way to write Windows paths in Python.

'C:\\Users\\main_user\\directory_name'
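
A short sketch of what immutability means in practice for strings. The variable names and the sample text are made up for illustration.

```python
text = 'data science'

upper = text.upper()     # returns a new string; text itself is unchanged
first = text[0]          # indexing: 'd'
word = text[5:]          # slicing: 'science'
print(upper, first, word, text)

# Trying to change a character in place fails, because str is immutable.
try:
    text[0] = 'D'
except TypeError as error:
    print('Strings are immutable:', error)

# In a raw string the backslash is kept as a regular character.
print(len(r'\n'), len('\n'))   # 2 1
```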

### Tuples

A tuple is an immutable ordered sequence of items. That means that once created, it cannot be changed. A tuple may contain mutable objects such as lists as items, but best practice is generally to avoid doing so. To denote a tuple, use a series of expressions (the items of the tuple) separated by commas ( , ); you may optionally enclose them in parentheses ` ( ) `.

Tuples are typically used to store collections of heterogeneous data; that is, data of different types. They are especially useful when you need a structure to hold the properties of a real-world object.


In [29]:
23, 531, 4667           #This is a tuple with 3 items. Parentheses are optional
(2.72,)                 #This is a tuple with one item. It needs a trailing comma
()                      #This is an empty tuple. Parentheses are required
tuple('wow')            #Built-in function that creates a new tuple ('w', 'o', 'w')
x = [1,2,4,6]
tuple(x)                #This creates and returns a tuple whose items are the same as those in x

(1, 2, 4, 6)
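
The sketch below illustrates using a tuple as a small record, and why mutable items inside a tuple are usually avoided. The names `person` and `record` and their values are hypothetical.

```python
person = ('Ana', 34, 'engineer')   # hypothetical record with heterogeneous data
name, age, job = person            # unpacking assigns each item to a name
print(name, age, job)

# Items of a tuple cannot be rebound.
try:
    person[1] = 35
except TypeError as error:
    print('Tuples are immutable:', error)

# A mutable item inside a tuple can still change, which is why it is discouraged.
record = ('scores', [90, 85])
record[1].append(77)
print(record)                      # ('scores', [90, 85, 77])
```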

### Lists

A list is a mutable ordered sequence of items, meaning you can add, remove, and modify a list's elements. The items of a list are arbitrary objects and may be of different types. Lists can have duplicate elements. To denote a list, use a series of expressions (the items of the list) separated by commas ( , ), within brackets. 

Although Python does allow you to have elements of different data types in the same list, best practice suggests using lists to contain elements that represent a series of usually related, similar things that can be grouped together. A typical list contains only elements belonging to a single category (that is homogeneous data, such as people's names, article titles, or participant numbers). 

In [30]:
[42, 3.14, 'hello']             #List with three items. Brackets [ ] are mandatory
['title']                       #List with only one item
[]                              #Empty list
list('wow')                     #Built-in function that creates the list ['w', 'o', 'w']
x = 'letras'
list(x)                         #This creates and returns a list whose items are the same as those in x

['l', 'e', 't', 'r', 'a', 's']

Operations and methods on lists are very useful in data analysis. Let's see some of them:

In [31]:
# One way to modify a single item of a list is by assigning to an index.
x = [1, 2, 3, 4, 49, 76]
x[1]= 42                        #This modifies the original list x by replacing the item at index 1 with 42.
x                               #x is now [1, 42, 3, 4, 49, 76]

[1, 42, 3, 4, 49, 76]

In [32]:
# Another way to modify a list object is to use a slice as the target of an assignment statement.

x[1:4]= [81, 55, 63, -8]        #This replaces the items at indices 1, 2 and 3 with four new items, so the list grows by one.
x                               #x is now [1, 81, 55, 63, -8, 49, 76]

[1, 81, 55, 63, -8, 49, 76]

In [33]:
x[3::3]=[0, 34]                 #Slice assignment with a step: this replaces every third item from index 3 to the end (indices 3 and 6).
x                               #x is now [1, 81, 55, 0, -8, 49, 34]

[1, 81, 55, 0, -8, 49, 34]

In [34]:
del x[2]                        #This deletes the item at index 2 (the third item), reducing the length of the list.
x                               #x is now [1, 81, 0, -8, 49, 34]

[1, 81, 0, -8, 49, 34]

In [35]:
x.count(0)                      #This method returns the number of items of x that are equal to 0. In this case 1. 

1

In [36]:
x.index(0)                      #This method returns the index of the first occurrence of an item in x. In this case, 2.

2

In [37]:
len(x)                          #This returns the number of items in a list. x contains 6 items.

6

In [38]:
x.append(8)                     #This method adds the value 8 to the end of the list, creating a new index 6 that holds 8.
x                               #The list x is now [1, 81, 0, -8, 49, 34, 8]

[1, 81, 0, -8, 49, 34, 8]

In [39]:
x.clear()                       #This removes all items from the list, leaving x empty.
x

[]

In [40]:
a = [11, 12, 13]
x.extend(a)                     #This appends all the items of the list a to the end of the list x. 
x                               #x is now [11, 12, 13]

[11, 12, 13]

In [41]:
x.insert(1,89)                  #This method inserts the item 89 at index 1, shifting the following items of x one position to the right.
x                               #x is now [11, 89, 12, 13]

[11, 89, 12, 13]

In [42]:
x.pop(-2)                       #This method returns the item at index -2 (here 12) and removes it from x.
x                               #x is now [11, 89, 13]

[11, 89, 13]

In [43]:
x.remove(13)                    #This method removes the first occurrence of the value 13 from x. 
x                               #x is now [11, 89]

[11, 89]

In [44]:
x.reverse()                     #This method reverses, in place, the items of a list.
x                               #x is now [89, 11]

[89, 11]

In [45]:
x.sort()                        #This method sorts, in place, the items of a list (in ascending order by default). 
                                # x is now [11, 89]
x                               #If reverse order is required, then the argument reverse=True is needed.

[11, 89]
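
Because `sort()` works in place and returns `None`, the built-in function `sorted()` is often handier when the original list must be kept. A minimal sketch, with a made-up list `words`:

```python
words = ['pear', 'fig', 'banana', 'kiwi']   # made-up sample data

by_length = sorted(words, key=len)          # new list, ordered by item length
descending = sorted(words, reverse=True)    # new list, reverse alphabetical order
print(by_length)      # ['fig', 'pear', 'kiwi', 'banana']
print(descending)     # ['pear', 'kiwi', 'fig', 'banana']
print(words)          # the original list is untouched

result = words.sort() # sorts in place and returns None
print(result, words)  # None ['banana', 'fig', 'kiwi', 'pear']
```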

### Sets

A Python set is an unordered collection of unique items. Duplicate items are not allowed in a set. To denote a *set*, you can use a series of elements separated by commas ( , ) within braces ( `{ }` ).

In [46]:
{42, 3.14, 'hello'}             #Set with three items. Braces { } are mandatory
{'title'}                       #Set with only one item
set()                           #Empty set. {} is an empty dictionary
set('wow')                      #Built-in function that creates the set {'o', 'w'}
x = 'letras'
set(x)                          #This creates and returns a set whose items are the unique letters of x

{'a', 'e', 'l', 'r', 's', 't'}

Python provides a variety of operations applicable to sets. Let's see them:

In [47]:
# Take a look at the set s:
s = {'male', 'doctor' ,34 ,'engineering','America','Florida'}

'male' in s                     #This operation validates if a value is contained in a set

True

In [48]:
len(s)                          #This function returns the number of elements in a set

6

In [49]:
m = s.copy()                    #This method returns a shallow copy whose items are the same objects as those of the original set.
print('The shallow copy is the set ',m)
print('While the original set is ',s)

The shallow copy is the set  {34, 'Florida', 'engineering', 'male', 'America', 'doctor'}
While the original set is  {34, 'Florida', 'engineering', 'male', 'America', 'doctor'}


In [50]:
m.remove('male')            #This method removes an item from the set.
m                           #It raises a KeyError if the item is not found in the set
                            #The copied set is now {34, 'America', 'Florida', 'doctor', 'engineering'}

{34, 'America', 'Florida', 'doctor', 'engineering'}

In [51]:
#We can compare sets by calling:
s.difference(m)             #This returns the set of all items of s that aren't in m

{'male'}

In [52]:
s.intersection(m)           #This method returns the common elements in both sets

{34, 'America', 'Florida', 'doctor', 'engineering'}

In [53]:
m.issubset(s)               #This method returns True when all items of m are also in s
                            #This means m is a subset of s

True

In [54]:
m.issuperset(s)             #This method returns True when all items of s are also in m
                            #Here it is False: s contains 'male', which is not in m

False

In [55]:
k = {'veterans'}
s.isdisjoint(k)             #This method returns true if the intersection of those sets is the empty set
                            #This means there are no common elements

True

In [56]:
k.add('male')               #This method adds one, and only one, element to a set
k

{'male', 'veterans'}

In [57]:
k.add('Washington')
k.symmetric_difference(s)   #This method returns the set of all items that are in exactly one of the two sets.

{34, 'America', 'Florida', 'Washington', 'doctor', 'engineering', 'veterans'}

In [58]:
k.union(s)                  #This method returns the set of all items that are in either set

{34,
 'America',
 'Florida',
 'Washington',
 'doctor',
 'engineering',
 'male',
 'veterans'}

In [59]:
s.discard('Florida')        #This method eliminates 'Florida' as an item of the set s
s                           #This has no effect if the item is not part of the set already

{34, 'America', 'doctor', 'engineering', 'male'}
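
Most of the set methods above also have operator forms, and sets are a quick way to remove duplicates from a list. A small sketch with made-up data (the names `a`, `b`, and `states` are illustrative); `sorted()` is used only to print the items in a predictable order.

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

print(sorted(a | b))   # union: [1, 2, 3, 4, 5]
print(sorted(a & b))   # intersection: [3, 4]
print(sorted(a - b))   # difference: [1, 2]
print(sorted(a ^ b))   # symmetric difference: [1, 2, 5]

# Removing duplicates from a list by passing it through a set.
states = ['NY', 'FL', 'NY', 'TX', 'FL']   # made-up sample data
unique_states = sorted(set(states))
print(unique_states)   # ['FL', 'NY', 'TX']
```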

### Dictionaries

Dictionaries are the only built-in mapping type provided by Python. Dictionaries are mutable collections of _key-value_ pairs, where each key is unique and identifies an item of data, the value. Since Python 3.7, dictionaries preserve the order in which keys were inserted. Each key is separated from its value by a colon ( : ), and key-value pairs are separated by commas ( , ), within braces ( `{ }` ). Dictionaries, like tuples, are useful for storing heterogeneous data about real-world objects.

In [60]:
{'x':42, 'y':3.14, 'z':35}      #This is a dictionary with 3 items, str keys
{1:56, 34:964}                  #This is a dictionary with 2 items, int keys
{23:'za', 'br':235}             #This is a dictionary with 2 items, different keys
{}                              #This is an empty dictionary
dict()                          #Built-in function that also produces an empty dictionary
dict(x=42,y=3.14,z=35)          #Built-in function that creates the dictionary {'x': 42, 'y': 3.14, 'z': 35}


{'x': 42, 'y': 3.14, 'z': 35}

Python provides a variety of operations applicable to dictionaries. Let's see the following:

In [61]:
# Since dictionaries are containers, the built-in function len() can take a dictionary as its argument.
d = {'age':25, 'food':'fish', 'sport':'soccer', 'state':'New york', 'childs':2, 'status':'single'}

len(d)                          #This function returns the number of key-value pairs in a dictionary.

6

In [62]:
# Keys can be checked in a dictionary with the in operator, as follows:
'food' in d                     #In this case it returns True because the key 'food' exists in d

True

In [63]:
#To get the value currently associated with a key in a dictionary, we use indexing.
d['sport']                      #This is the way we access values

'soccer'

In [64]:
#Adding a new key is as easy as follows:
d['floor']='second'
#The dictionary now shows a new key-value pair 'floor':'second' 
d

{'age': 25,
 'food': 'fish',
 'sport': 'soccer',
 'state': 'New york',
 'childs': 2,
 'status': 'single',
 'floor': 'second'}

In [65]:
# A shallow copy of the dictionary can be obtained with 
h = d.copy()
h                           #The new dictionary h is a copy of d

{'age': 25,
 'food': 'fish',
 'sport': 'soccer',
 'state': 'New york',
 'childs': 2,
 'status': 'single',
 'floor': 'second'}

In [66]:
#Items (key-value pairs) can be deleted by using the del statement
del d['childs']
d

{'age': 25,
 'food': 'fish',
 'sport': 'soccer',
 'state': 'New york',
 'status': 'single',
 'floor': 'second'}

In [67]:
d.items()                   #This method returns an iterable view of all current (key, value) pairs.

dict_items([('age', 25), ('food', 'fish'), ('sport', 'soccer'), ('state', 'New york'), ('status', 'single'), ('floor', 'second')])

In [68]:
d.keys()                    #This method returns all the keys in a dictionary.

dict_keys(['age', 'food', 'sport', 'state', 'status', 'floor'])

In [69]:
d.values()                  #This method returns all the values in a dictionary.

dict_values([25, 'fish', 'soccer', 'New york', 'single', 'second'])

In [70]:
d.pop('status')             #This method removes the key and returns its value when the key is in the dictionary.

'single'

In [71]:
# The new dictionary doesn't contain the 'status':'single'
d                          

{'age': 25,
 'food': 'fish',
 'sport': 'soccer',
 'state': 'New york',
 'floor': 'second'}

In [72]:
d.setdefault('food','meat')     #Returns the value if the key exists in the dictionary.
                                #If the key does not exist, it is added with the passed value, which is then returned.

'fish'

In [73]:
d.popitem()                     #This method removes and returns a (key, value) pair from d in last-in, first-out order.
                                #In this case it returns the last added item.

('floor', 'second')
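
Two more dictionary idioms are worth a quick sketch: reading a key that may be missing with `get()`, and building a new dictionary with a comprehension. The dictionary `person` and the other names here are made-up sample data.

```python
person = {'age': 25, 'food': 'fish', 'sport': 'soccer'}   # made-up sample data

# get() returns None, or a default of your choice, instead of raising KeyError.
missing = person.get('state')
fallback = person.get('state', 'unknown')
print(missing, fallback)            # None unknown

# Looping over items() yields (key, value) pairs in insertion order.
for key, value in person.items():
    print(key, '->', value)

# Dictionary comprehension: keep only the text values, in upper case.
text_fields = {key: value.upper()
               for key, value in person.items()
               if isinstance(value, str)}
print(text_fields)                  # {'food': 'FISH', 'sport': 'SOCCER'}
```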

There are two special objects worth knowing, __None__ and the __Ellipsis ( `...` )__.

None denotes a null object, and has no methods or other attributes. It's suitable to use None as a placeholder when a reference is needed but you don't care what object you refer to, or when you need to indicate that no object is there. Functions return None as their result unless they have specific return statements coded to return other values. None can be used as a dictionary key.

The Ellipsis, written as three periods with no intervening spaces ( `...` ), is a special object used in numerical applications or as an alternative to None when None is a valid entry.
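
A sketch of both objects in use. The functions `find_first_negative` and `describe` are hypothetical helpers written only for this example.

```python
def find_first_negative(numbers):
    """Return the first negative number, or None when there is none."""
    for n in numbers:
        if n < 0:
            return n
    # Reaching the end without a return statement returns None implicitly.

result = find_first_negative([3, 1, 4])
print(result is None)      # True

def describe(value=...):
    """Use Ellipsis as the 'nothing was passed' marker, so None stays a valid value."""
    if value is ...:
        return 'no argument given'
    return f'got {value!r}'

print(describe())          # no argument given
print(describe(None))      # got None
```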


## __Variables and other references__

Python accesses data values through *references*. A reference is a name that refers to a value (object). References take the form of variables, attributes, and items. 

A __variable__ is a name used to reference a value (object). The existence of a variable begins with a statement that binds the variable (in other words, sets a name to hold a reference to some object); that's why there are no declarations in Python. A variable has no intrinsic type: its type is defined by the object it refers to. The __del__ statement unbinds a variable reference, although doing so is rare. Any identifier can be used to name a variable except the reserved keywords.

Attributes and items are references that belong to an object. An __attribute__ is accessed by writing the object followed by a period ( `.` ) and the attribute name; an attribute that can be called as a function is known as a method. An __item__ is accessed by writing the object followed by an index or key in brackets ( `[ ]` ). Attributes and items are widely used in data science libraries such as pandas to get information from objects.

Assignment statements can be plain or augmented. Plain assignment to a variable is how you create a new variable or rebind an existing variable to a new value; plain assignment can also target an attribute or item of an object. Augmented assignment (for example `+=`) cannot, per se, create new references, but it can rebind a variable or ask an object to rebind one of its existing attributes or items.
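
The idea that variables are references, not containers, can be sketched as follows. The names `a`, `b`, and `c` are illustrative only.

```python
a = [1, 2, 3]
b = a                  # b is a second reference to the same list object
b.append(4)
print(a)               # [1, 2, 3, 4]: the change is visible through both names
print(a is b)          # True

b = b + [5]            # plain assignment rebinds b to a brand-new list
print(a is b)          # False
print(a, b)            # [1, 2, 3, 4] [1, 2, 3, 4, 5]

c = a
a += [6]               # augmented assignment: a list is modified in place
print(c is a, c)       # True [1, 2, 3, 4, 6]
```

Keeping this behavior in mind helps avoid surprises when two names end up referring to the same mutable object.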


## __Control Flow Statements__

Control flow statements are those that regulate the order in which the program's code executes. The main control flow structures are *conditional statements, loops, and functions*.

### The _if_ Statement

This is a compound statement that lets us execute a block of statements only when some criterion is met, or choose which statements to execute depending on mutually exclusive conditions.

In [74]:
w = 4
if w < 0:                                   # w < 0 is the condition evaluated; if it is true, the block right below runs
    print('w is negative')
elif w % 2:                                 # When the if condition is false, the elif conditions are evaluated in order.
    print('w is positive and odd')          # The elif clauses are optional, and there can be as many as needed.
else:                                       # The optional else clause runs when none of the conditions above is true.
    print('w is even and nonnegative')

w is even and nonnegative


When the _if_ clause's condition evaluates as true, the statements within the _if_ clause execute, then the entire _if_ statement ends. Otherwise, Python evaluates each _elif_ clause's condition in order and executes the body of the first one that is true. If no condition is true and an _else_ clause exists, its body executes. In any case, statements following the entire _if_ construct, at the same level, execute next.

### The _while_ statement

The __while__ statement repeats execution of a statement or block of statements for as long as a conditional expression evaluates as true. It can also include an else clause, as well as break and continue statements.


In [75]:
i = 456
count = 0
while i > 0:
    i //= 2                                 #floor division
    count += 1
print('The approximate log2 is', count)

The approximate log2 is 9


First, Python evaluates an expression, known as the loop condition, in a boolean context. When the condition evaluates as false, the __while__ statement ends. When it evaluates as true, the statement or block of statements that make up the loop body executes, and then Python evaluates the loop condition again. 
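
As a small illustration of a condition evaluated in a boolean context, the sketch below loops while a list is non-empty. The list contents and names are invented for the example:

```python
pending = ['load', 'clean', 'plot']
done = []

while pending:                      # an empty list counts as false, so the loop stops then
    task = pending.pop(0)           # remove the first pending task
    done.append(task)

print(done)                         # ['load', 'clean', 'plot']
print(pending)                      # []
```

Because every iteration removes one element from `pending`, the condition is guaranteed to become false eventually.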

> Take care not to write a loop whose condition never becomes false. Such a loop never ends on its own and can keep consuming your system's resources until you interrupt it.


### The _for_ Statement

The _for_ statement repeats execution of a statement or block of statements controlled by an iterable expression. The __in__ keyword is part of the syntax of the __for__ statement; its purpose here is distinct from the __in__ operator, which tests membership. 

In [76]:
for letter in 'ciao':
    print(f'give me a {letter}')

give me a c
give me a i
give me a a
give me a o


A __for__ statement can also include an else clause, as well as break and continue statements. The _iterable_ may be any iterable Python expression; in particular, any sequence is iterable. 

In our previous example, _letter_ is the target. A __target__ is normally an identifier naming the control variable of the loop; the for statement successively rebinds this variable to each item of the iterable, in order. The statement or statements that make up the loop body execute once for each item in _iterable_, unless the loop ends because of an exception.

In [77]:
for key,value in h.items():
    if key and value:
        print(key,value)

age 25
food fish
sport soccer
state New york
childs 2
status single
floor second


An _iterator_ is an object _i_ such that you can call next(i), which returns the next item of iterator _i_ or, when exhausted, raises a StopIteration exception. 

The function __range(x)__ generates the consecutive integers from 0 (included) up to x (excluded). This is the simplest way to loop n times in Python.
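
The two ideas above can be sketched as follows. The built-in `iter()` is used here to obtain an iterator from a string, which is what a `for` statement does implicitly; the variable names are illustrative:

```python
letters = iter('hi')            # ask the string for an iterator
first = next(letters)           # 'h'
second = next(letters)          # 'i'
try:
    next(letters)               # nothing left to return
except StopIteration:
    exhausted = True

squares = []
for n in range(4):              # n takes the values 0, 1, 2, 3
    squares.append(n * n)

print(first, second, exhausted, squares)   # h i True [0, 1, 4, 9]
```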

### The __*break*__ and __*continue*__ statements

The __*break*__ statement is used only within a loop body. When _break_ executes, the loop terminates without executing any __else__ clause on the loop. When loops are nested, a break terminates only the innermost nested loop. In practice, a break is typically within a clause of an __if__ statement in the loop body, so that break executes conditionally.

In [78]:
r = 1
while r > 0:                             
    print(r)
    r += 1
    if r > 5:                       # This loop can never terminate "naturally" 
        break

1
2
3
4
5


The __*continue*__ statement can exist only within a loop body. It causes the current iteration of the loop body to terminate, and execution continues with the next iteration of the loop. In practice, a *continue* is usually within a clause of an _if_ statement in the loop body, so that continue executes conditionally. 
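
A short sketch of a conditional `continue`, with made-up data, where skipping an iteration has a visible effect:

```python
readings = [3, -1, 4, -1, 5]
positives = []

for value in readings:
    if value < 0:
        continue                # skip the rest of the body for negative values
    positives.append(value)     # reached only when continue did not run

print(positives)                # [3, 4, 5]
```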

The _while_ and _for_ statements may optionally have a trailing else clause. The __else__ clause executes when the loop terminates naturally, but not when the loop terminates prematurely (via break, return, or an exception). 
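
To see both outcomes side by side, this sketch runs the same inner loop over two invented lists, one of which triggers `break`:

```python
outcomes = []

for values in ([1, 2, 3], [1, -2, 3]):
    for value in values:
        if value < 0:
            outcomes.append(f'stopped early at {value}')
            break               # premature exit: the else clause is skipped
    else:
        outcomes.append('no negative value found')   # natural termination

print(outcomes)   # ['no negative value found', 'stopped early at -2']
```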

In [79]:
# This example illustrates the use of continue and of an else clause after a loop:

f = [1,2,3,4,5,6]
for r in f:
    if r < 5:
        print(r)
    elif r == 5:
        continue                # Skips the rest of the loop body (r += 1) for this item
    else:
        print('The highest value in the list is:', r)
    r += 1                      # Has no effect on the loop: r is rebound to the next item anyway

else:
    print('This loop terminates naturally')

1
2
3
4
The highest value in the list is: 6
This loop terminates naturally


The body of a compound statement cannot be empty; it must always contain at least one statement. You can use a __pass__ statement, which performs no action, as an explicit placeholder when a statement is syntactically required but you have nothing to do.
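
For instance, a `pass` can hold the place of a branch you have not written yet. The variable names below are only for illustration:

```python
value = 7

if value < 0:
    pass                        # placeholder: negative values need no handling yet
else:
    label = 'nonnegative'

for _ in range(3):
    pass                        # a loop body that intentionally does nothing

print(label)                    # nonnegative
```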

## __Functions__

A __function__ is a group of statements that execute upon request. A request to execute a function is known as a _function call_. Functions bring many advantages: clarity, readability and code reusability all improve when you avoid having substantial chunks of module-level code.

Functions can receive arguments that specify data upon which the function performs its computation. 

In Python, a function always returns a result: either None or a value, the result of the computation. Functions are treated as objects (values) in Python, which means that, like any other object, a function can be bound to a variable, be an item in a container, or be an attribute of an object. Functions can also be keys in a dictionary. 
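
A brief sketch of functions used as ordinary objects; `double` and `square` are hypothetical helpers defined only for this example:

```python
def double(x):
    return 2 * x

def square(x):
    return x * x

alias = double                                      # bound to another variable
pipeline = [double, square]                         # items in a list
labels = {double: 'doubles', square: 'squares'}     # keys in a dictionary

value = 3
for step in pipeline:                               # apply each function in turn
    value = step(value)

print(alias(5), value, labels[square])              # 10 36 squares
```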

### Defining Functions: The __def__ statement

The __def__ statement is a single-clause compound statement used to create a function. Its syntax is:


In [1]:
def function_name(parameters): 
    statement(s)

function_name is an identifier, and the non-empty indented statement(s) are the function body. When the interpreter encounters a __def__ statement, it compiles the function body, hence creating a function object. 

*parameters* is an optional list specifying the identifiers that will be bound to the values that each function call provides. Parameters should be distinguished from *arguments*: the latter are the values provided for the parameters in function calls. 

The function's *signature* defines how you call the function: it includes the number of parameters, their names, how many of them are mandatory, and whether (and where) unmatched arguments are collected. 
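
One way to look at a signature at runtime is the standard library's `inspect.signature`. The sketch below applies it to a hypothetical function `scale`, which is not part of the material above:

```python
import inspect

def scale(values, factor=2, *extra, verbose=False):
    """Hypothetical function, defined only so its signature can be inspected."""
    return [v * factor for v in values]

signature = inspect.signature(scale)
print(signature)                # (values, factor=2, *extra, verbose=False)

# Mandatory parameters: no default value, and not a *args / **kwargs collector
collectors = (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD)
mandatory = [
    name
    for name, parameter in signature.parameters.items()
    if parameter.default is inspect.Parameter.empty
    and parameter.kind not in collectors
]
print(mandatory)                # ['values']
```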

### Parameters

Parameters, more formally known as *formal parameters*, name the values passed into a function call and may specify default values for them. Each time you call the function, the call binds each parameter name to the corresponding argument value in a new local namespace, which Python later destroys on function exit.

For example, consider the signature of this function from the _random_ module: 

In [4]:
def sample(population, k, *, counts=None):
    pass

When calling `sample`, values for `population` and `k` are mandatory, and may be passed positionally or by name. `counts` is optional; if you do pass it, you must pass it as a named argument.

Generally speaking, positional parameters are followed by named parameters, with the positional and named argument collectors (if present) last. The positional-only marker ( `/` ), however, may appear after any of the positional parameters, provided it comes before any `*` marker or collector. 
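
These rules can be tried out with a hypothetical function, `describe`, invented for this sketch. It assumes Python 3.8 or later, where the `/` marker is available:

```python
def describe(name, /, unit='cm', *, digits=1):
    """Hypothetical: name is positional-only, digits is keyword-only."""
    return (name, unit, digits)

a = describe('height')                     # defaults fill unit and digits
b = describe('height', 'm', digits=2)      # unit positionally, digits by name
c = describe('height', unit='m')           # unit by name

try:
    describe('height', 'm', 2)             # digits cannot be passed positionally
except TypeError:
    positional_digits_rejected = True

try:
    describe(name='height')                # name cannot be passed by name
except TypeError:
    named_name_rejected = True

print(a, b, c)
```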

### Attributes of Function objects

The __def__ statement sets some attributes of a function object *f*: 

`f.__name__` is the identifier that def uses as the function's name. It can be rebound to any string value.

`f.__defaults__` is the tuple of default values for optional parameters (None when no parameter has a default). Defaults of keyword-only parameters are stored separately, in `f.__kwdefaults__`.

`f.__doc__` is the function's *documentation string* (__docstring__), which you can use to explain what the function does and what its parameters mean. Docstrings can span multiple physical lines, so it's best to write them as triple-quoted string literals.


In [6]:
def sum_sequence(*numbers):
    """Return the sum of multiple numerical arguments.
    
    The arguments are zero or more numbers. 
    The result is their sum.
    """
    return sum(numbers)
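
As a quick check of the attributes described above, the sketch below inspects a hypothetical function `greet`, defined here only for illustration:

```python
def greet(name, greeting='Hello'):
    """Return a greeting for name."""
    return f'{greeting}, {name}!'

print(greet.__name__)           # greet
print(greet.__defaults__)       # ('Hello',)
print(greet.__doc__)            # Return a greeting for name.

greet.__name__ = 'salute'       # the name attribute can be rebound to another string
print(greet.__name__)           # salute
```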

### Function Annotations

_Annotations_ are expressions used to annotate either a parameter or the return value of a function. The `__annotations__` attribute of the function object is a dict mapping each annotated identifier to the respective annotation. 


In [7]:
def f(a: 'foo', b) -> 'bar': pass

Python itself does not act on annotations; however, this optional step helps us remember useful information during development and maintenance of the code. External tools such as type checkers can also read annotations to identify and locate data type mismatches in function arguments and return values. 

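Any parameter in a def's parameter list can be annotated with the form *identifier: expression*; the expression's value becomes the annotation for that parameter. The return value is annotated with the form *-> expression*, placed between the closing parenthesis and the colon that ends the def clause. 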

In [9]:
f.__annotations__                           # returns: {'a': 'foo', 'return': 'bar'}

{'a': 'foo', 'return': 'bar'}
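
In practice, annotations are most often types. The sketch below uses a hypothetical function `mean` to show that Python records them but does not enforce them at call time:

```python
def mean(values: list, digits: int = 2) -> float:
    """Hypothetical helper: average of values, rounded to digits decimals."""
    return round(sum(values) / len(values), digits)

print(mean.__annotations__)
# {'values': <class 'list'>, 'digits': <class 'int'>, 'return': <class 'float'>}

print(mean([1, 2, 4]))          # 2.33

# A tuple is accepted even though the annotation says list:
print(mean((1, 2, 4)))          # 2.33
```

Catching the tuple-versus-list mismatch would be the job of an external type checker, not of the interpreter.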

### The return Statement

You can use the __return__ keyword in Python only inside a function body, and you can optionally follow it with an expression. When return executes, the function terminates, and the value of the expression is the function result. 

A function returns __None__ when it terminates by reaching the end of its body, or by executing a return statement with no expression (or by explicitly executing return None). As a matter of style, *never* write a return statement without an expression at the end of a function body. When some return statements in a function have an expression, all of them should; return None should only ever be written explicitly to meet that requirement.
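
A small sketch of both situations, using two hypothetical functions:

```python
def sign(x):
    if x > 0:
        return 'positive'
    if x < 0:
        return 'negative'
    return None                 # explicit, because the other branches return a value

def remember(x, store):
    store.append(x)             # reaches the end of the body: the result is None

seen = []
print(sign(3), sign(-3), sign(0))   # positive negative None
print(remember(3, seen), seen)      # None [3]
```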

It's recommended to make clear in each call which arguments are positional and which are named. When *keyword-only parameters* are present, the arguments for them must be passed by name.


### Lambda Expressions

A __lambda__ expression is the anonymous equivalent of a normal function whose body is a single return statement. It is an alternative to the def / return pair, and its syntax does not include the return keyword. 

__lambda__ parameters: expression

This form can sometimes be handy when you want to use an extremely simple function as an argument or return value.


In [10]:
a_list = [ -2, -1, 0, 1, 2 ]

sorted(a_list, key = lambda x: x * x)       # returns: [0, -1, 1, -2, 2]


[0, -1, 1, -2, 2]

While __lambda__ can at times be handy, __def__ is usually better: it's more general and helps you make your code more readable, since you can choose a clear name for the function.
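
To compare the two forms, the sketch below sorts an illustrative list of names first with a lambda and then with an equivalent named function (`lowercase` is a hypothetical helper):

```python
words = ['pandas', 'NumPy', 'scikit-learn', 'SciPy']

# Anonymous key function
by_lambda = sorted(words, key=lambda word: word.lower())

# Equivalent named function: longer, but the name documents the intent
def lowercase(word):
    return word.lower()

by_def = sorted(words, key=lowercase)

print(by_lambda)                # ['NumPy', 'pandas', 'scikit-learn', 'SciPy']
print(by_lambda == by_def)      # True
```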

Python's world is gigantic! I've covered here the basics you need to know as a data scientist using Python; however, I cannot stress enough the importance of reading the official documentation available at https://docs.python.org/3/ . Please feel free to take a look whenever needed. 

## __Python Ecosystem__

The Python ecosystem of libraries, frameworks, and tools is enormous and growing. Python is used for web scraping, data analysis, web development, internet of things development (IoT), machine learning, DevOps, general scientific computing, and many other computing and scripting uses. The main libraries I'll be exploring in this study material are:

_Pandas:_ For data analysis. <br>
_Matplotlib:_ The foundational library for visualization. <br>
_NumPy:_ The numeric library that serves as the foundation of most numerical computing in Python. <br>
_Seaborn:_ A statistical visualization tool built on top of Matplotlib. <br>
_Statsmodels:_ A library with many advanced statistical functions. <br>
_SciPy:_ Advanced scientific computing, including functions for optimization, linear algebra, image processing and more. <br>
_Scikit-learn:_ The most popular machine learning library for Python (not deep learning). <br>

Among many other tools for specific use-cases.

## __Summary__

Python is a great tool for learning (and doing) data science because it's free, simple to write and understand, it interacts well with other programming languages, it's object-oriented, and it has lots of useful data science–related libraries. In Python, any data value is considered an _object_. The six built-in data types covered here are numbers, strings, lists, tuples, dictionaries, and sets.

Control flow statements are those that regulate the order in which the program’s code executes. The main control flow structures are conditional statements, loops, and functions. 

A function is a group of statements that execute upon request. Functions bring many advantages: clarity, readability and code reusability all improve when you avoid having substantial chunks of module-level code.

The Python Ecosystem is huge because there are libraries for almost any action you need. 