In [None]:
# This cell changes the parameters of the RISE slideshow,
# such as the window width/height and whether a scroll bar is enabled

from notebook.services.config import ConfigManager
cm = ConfigManager()
cm.update('livereveal', {
              'width': 1000,
              'height': 1500,
              'scroll': True,
})

# Week 2: Introduction to Python for Data Science

## Why Do You Need Python for Data Science?

### Improves Work for Everyone

Widely used top programming language

Huge growing ecosystem due to its open source nature

Almost every industry is on board

### Descriptive Analytics and Dashboards

Exploratory data analysis

Manipulation of data

Streamlining workflows

Creating visualisations/dashboards (e.g. Plotly, Streamlit)

![Fig. 1. Example of a dashboard in Python](https://www.dropbox.com/s/730bet2q74ankrf/dashboard.jpg?raw=1)

### Machine Learning

Predicting and classifying new data

Recommender systems

Can work with popular Google machine learning libraries (such as Tesseract and TensorFlow)

### Predictive/Prescriptive Analytics

Decision science
    - Anticipate what, when and why a certain outcome will happen
    - What to do with information

Deep learning to optimise outcomes

![Fig. 2. Prescriptive Analytics](https://www.dropbox.com/s/wpleczxpzz1bub9/prescriptive.jpg?raw=1)

## Some statistics

### Popularity

Python is the third most popular programming language according to the [TIOBE](https://www.tiobe.com/tiobe-index/) index, and the fastest-growing language in that index for the current year

![Fig. 3a. TIOBE Index 2019](https://www.dropbox.com/s/2s2hbucml60purg/tiobe.jpg?raw=1)

![Fig. 3b. TIOBE Index 2019](https://www.dropbox.com/s/qrglvt9sjtio6py/tiobepython.jpg?raw=1)

According to the 2019 developer survey run by [Stack Overflow](https://insights.stackoverflow.com/survey/2019), Python is the 4th most popular programming language in the world, both among all respondents and among professional developers

![Fig. 4. Most popular technologies](https://www.dropbox.com/s/qio08bridd9hxvh/stackoverflowmostpopular.jpg?raw=1)

Python is currently the top-ranked programming language according to the Institute of Electrical and Electronics Engineers [(IEEE)](https://spectrum.ieee.org/at-work/innovation/the-2018-top-programming-languages)

![Fig. 5. IEEE Ranking](https://www.dropbox.com/s/z9t87sqsru1tk0b/stats3.jpg?raw=1)

### Employability

Python is currently the language in which employer interest is growing fastest, according to [Google Trends](https://medium.freecodecamp.org/best-programming-languages-to-learn-in-2018-ultimate-guide-bfc93e615b35)

![Fig. 6. Google trends interest over time](https://www.dropbox.com/s/dlwxeks6oth3uo2/stats2.jpg?raw=1)

It is the 12th best-paid language, but one of the fastest to adopt

![Fig. 7a. Salary by programming language](https://www.dropbox.com/s/pn74qojkzrwcxi2/stackoverflowbestpaid.jpg?raw=1)

![Fig. 7b. Salary by programming language w.r.t. years of experience](https://www.dropbox.com/s/p606cnr28xmt2ac/salarybylanguage.jpg?raw=1)

### Reach/Scalability

![Fig. 8. Interconnectivity of different programming languages, IDEs and environments](https://www.dropbox.com/s/ehb524ls7pffirk/graph.jpg?raw=1)

## Fundamentals of Programming
How do we define Python?

### Levels of Programming Languages
![Fig. 9. Types of programming languages](https://www.dropbox.com/s/zgnryvnslnpw2qf/proglangtypes.jpg?raw=1)
[Source](http://4.bp.blogspot.com/-NvijJmjC13I/TmIbqlKKl8I/AAAAAAAAA3Q/mK4Nmy43en8/s1600/Untitled-1+%25281%2529.jpg)

#### Advantages of High-level Programming Languages

Programmer friendly

Easy to write, debug and maintain

Provide a higher level of abstraction from machine languages

Machine independent language

Easy to learn

Less error prone

#### Disadvantages of High-level Programming Languages

Slower (takes additional time to translate)

Less memory efficient

Cannot directly communicate with the hardware

[Some literature](https://www.tutorialandexample.com/middle-level-language/) considers Java/C/C++ to be *middle*-level languages and Python a high-level or very-high-level one, based on how much abstraction each provides

![Fig. 10. Languages](https://www.dropbox.com/s/c7l7c21r4elknj2/middle-level-language.jpg?raw=1)

### Translation of Operations: Compiled vs Interpreted Programming Languages

#### Compiled Languages

The high-level source code is translated to machine code using a compiler

An addition + gets directly translated to the ADD instruction in the machine code

Examples: C, Fortran, COBOL, C++, and Java (compiled to bytecode)

##### Advantages of Compiled Languages

Ready to run

Often faster

Source code is kept private

#### Interpreted Languages

Instructions are not executed directly by the machine, but read and executed by another program (the interpreter)

Instructions run freely without the need to compile them first!

Examples: JavaScript, Perl, R, *Python*

##### Advantages of Interpreted Languages

Cross-platform (portability)

Simpler to test

Errors are displayed as each instruction is run
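
The last point can be illustrated with a minimal sketch. The name `missing_name` is a hypothetical, deliberately undefined variable: Python only reports the problem when it reaches that instruction, so the earlier steps have already run.

```python
executed = []

try:
    executed.append("step 1")
    executed.append("step 2")
    executed.append(missing_name)  # hypothetical undefined name: fails only when reached
    executed.append("step 3")      # never runs
except NameError as error:
    message = str(error)

print(executed)  # the first two steps ran before the error appeared
print(message)
```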

![Fig. 11. Difference between compiled and interpreted programming languages](https://www.dropbox.com/s/31cnw6n1mmrv04h/compiledinterpreted.jpg?raw=1)
[Source](https://www.google.com/url?sa=i&rct=j&q=&esrc=s&source=images&cd=&cad=rja&uact=8&ved=2ahUKEwichoGF1KXkAhVOdhoKHebmAJwQjRx6BAgBEAQ&url=%2Furl%3Fsa%3Di%26rct%3Dj%26q%3D%26esrc%3Ds%26source%3Dimages%26cd%3D%26ved%3D%26url%3Dhttps%253A%252F%252Fmedium.com%252Ffrom-the-scratch%252Fstop-it-there-are-no-compiled-and-interpreted-languages-512f84756664%26psig%3DAOvVaw0CqS9Nmdo4wbc9J-p4WtL-%26ust%3D1567083827896505&psig=AOvVaw0CqS9Nmdo4wbc9J-p4WtL-&ust=1567083827896505)

### Typing: Static vs Dynamic Programming Languages

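Static typing is designed to optimise *hardware* efficiency: the type of every variable is fixed before the program runs

Dynamic typing is designed to optimise *programming* efficiency: types are worked out while the program runs, so less code is needed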

In fact, dynamic languages are usually implemented in a static one!
    - The reference Python interpreter (CPython) is written in C!
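
As a small illustration of dynamic typing, the sketch below reuses one name for values of three different types. The variable names are chosen only for this example.

```python
# A name is not tied to one type: it takes the type of the value it currently holds.
value = 3
first_type = type(value).__name__

value = "three"
second_type = type(value).__name__

value = 3.0
third_type = type(value).__name__

print(first_type, second_type, third_type)  # int str float
```

In a statically typed language, the second and third assignments would normally be rejected before the program runs.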

### So What is Python?

A widely used **high-level**, **interpreted**, and **dynamic** programming language

Emphasizes code readability

Its syntax allows programmers to express concepts in fewer lines of code

Similar in syntax and purpose to R and MATLAB

![Fig. 12. How to write an essay according to different programming languages!](https://www.dropbox.com/s/qeul7bjpljres1i/programminglanguages.jpg?raw=1)

# Installing Python

## The long and hard way

1. Install Python (https://www.python.org).
2. Install a Python Integrated Development Environment (IDE) such as IDLE (available when installing Python), PyCharm (https://www.jetbrains.com/pycharm/) or Spyder (https://pypi.org/project/spyder/).
3. Install Jupyter Notebook (http://jupyter.org/).

## The fast way: Anaconda Navigator

Everything can be easily installed using the Anaconda distribution, a bundle that includes [Anaconda Navigator](https://www.anaconda.com/download/).

![Fig. 13. Anaconda](https://www.dropbox.com/s/q1nkj5xrqz5qhrj/anaconda.jpg?raw=1)

# What Does Python Look Like?

In its simplest form, Python acts like a calculator. You simply write a calculation, and Python gives you the answer!

In [2]:
6+9

15

Moreover, you can also do some coding!

In [3]:
for i in range(10):
    print(i)

0
1
2
3
4
5
6
7
8
9


Notice the simplicity of the Python syntax: we do not need to define classes or follow a complex and strict structure of parentheses!

In this course we will use Jupyter Notebook not only for lectures, but also for laboratory activities and to code/present the coursework.

## Data Types and Data Structures

Python contains a pre-defined set of **classes** which can contain certain data types or data structures

Instead of defining objects/variables/classes/constructors, one may simply type a number and Python will infer the type of this value

In [4]:
## Type anything and then run the cell to see if Python recognises the object
1

1

### Numerical Data Types

* Integers
* Floats
* Booleans
* Hexadecimal
* Octal
* Complex
* and many more to import...

#### Integers

The most basic data type in Python

By default, a number is an integer if it is written without a decimal point

The `type()` **function** can be used to discover the type of a variable or a number

You can use comparison operators to evaluate integer values

In [5]:
type(3)

int

In [6]:
## Writing numbers in exponential notation (note that the result is a float, not an int)
2e5

200000.0

In [7]:
x=3
print(type(x))

<class 'int'>


In [8]:
# See if two ints are equal
3==5

False

In [9]:
# See if two ints are not equal
3!=5

True

In [10]:
# See if a number is smaller than another
3<5

True

In [11]:
# See if a number is smaller or equal to another
3<=5

True

In [12]:
# See if a number is larger than another
3>5

False

In [13]:
# See if a number is larger or equal to another
3>=5

False

#### Booleans (logical operators)

In [14]:
type(True)

bool

In [15]:
z = True
print(type(z))

<class 'bool'>
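
Booleans can be combined with the logical operators `and`, `or` and `not`. A minimal sketch, where the two variables are hypothetical example values:

```python
is_sunny = True     # hypothetical example values
is_weekend = False

both = is_sunny and is_weekend    # True only if both are True
either = is_sunny or is_weekend   # True if at least one is True
negated = not is_sunny            # flips the value

print(both, either, negated)  # False True False
```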


#### Float

Floats are how decimal numbers are represented in Python

Precise to about 15-17 significant digits

In [16]:
y=6.91289739812784749872987

In [17]:
type(y)

float

In [18]:
# A float stores only about 15-17 significant digits, so Python rounds the rest away
7.0789349236894739847398972348974238947

7.078934923689474

In [19]:
# We can also compare floats, but funny things may happen!
1.1+2.2==3.3

False

Do you know why?

![Fig. 14. Prove you are a robot!](https://www.dropbox.com/s/w9u524mgaezmxur/robot.jpg?raw=1)
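
One possible answer, as a sketch: floats are stored in binary, and values such as 1.1 and 2.2 cannot be stored exactly, so their sum is very slightly different from the stored value of 3.3. A common workaround is to compare with a tolerance, for instance with `math.isclose()` from the standard library.

```python
import math

total = 1.1 + 2.2
print(repr(total))  # shows the stored value, which is not exactly 3.3

exact_match = total == 3.3               # exact comparison fails
close_match = math.isclose(total, 3.3)   # comparison within a small tolerance succeeds

print(exact_match, close_match)  # False True
```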

#### The `isinstance()` function

In [20]:
isinstance(x,int)

True

In [21]:
isinstance(y,int)

False

You can also work with complex, binary, octal and hexadecimal numbers in Python.
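
A short sketch of these notations follows. The variable names are illustrative only. Note that binary, octal and hexadecimal are just different ways of writing an `int`, while complex numbers have their own type.

```python
binary_value = 0b1010    # binary literal, prefix 0b
octal_value = 0o17       # octal literal, prefix 0o
hex_value = 0xFF         # hexadecimal literal, prefix 0x
complex_value = 3 + 4j   # complex number, j marks the imaginary part

print(binary_value, octal_value, hex_value)  # 10 15 255
print(type(hex_value))                       # <class 'int'>
print(type(complex_value))                   # <class 'complex'>
print(complex_value.real, complex_value.imag, abs(complex_value))  # 3.0 4.0 5.0

# Converting an int back to each notation gives a string
print(bin(10), oct(15), hex(255))  # 0b1010 0o17 0xff
```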

#### Strings

In [22]:
a = 'Hi'
print(a)

Hi


In [23]:
# Any number inside quotation marks becomes a string
c = "45"
print(c)

45


In [24]:
# A boolean inside quotation marks becomes a string
b = "True"
print(b)

True


In [25]:
print(type(a),type(c),type(b))

<class 'str'> <class 'str'> <class 'str'>
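
To see why the distinction matters, the sketch below compares a number with the string that looks like it. The variable names are chosen for this example only.

```python
number = 45
text = "45"

number_sum = number + number  # arithmetic addition
text_sum = text + text        # strings are joined, not added
same = number == text         # an int is never equal to a str

print(number_sum)  # 90
print(text_sum)    # 4545
print(same)        # False
```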


### Conversion Functions

![Fig. 15. Conversion Functions](https://www.dropbox.com/s/77jrypjzpdpegkn/cf.jpg?raw=1)

In [26]:
# The int() function converts a number into an integer.
# For floats it drops the decimal part (it truncates, it does not round)
int(7.8)

7

In [27]:
# Converting booleans into ints
int(True)

1

In [28]:
# float() converts an int into a float, shown with a .0 decimal part
float(9)

9.0

In [29]:
# bool() turns any number other than 0 into True
bool(8)

True

In [30]:
bool(0)

False
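
The same functions, together with `str()`, also convert to and from strings. A minimal sketch with illustrative values:

```python
as_int = int("45")        # string of digits -> int
as_float = float("3.5")   # string -> float
as_text = str(7.8)        # number -> string

empty_is_false = bool("")       # an empty string is False
text_is_true = bool("False")    # any non-empty string is True, whatever it says

print(as_int, as_float, as_text)     # 45 3.5 7.8
print(empty_is_false, text_is_true)  # False True
```

Note that `int("7.8")` raises a `ValueError`, because the string does not represent a whole number.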

# Data Structures

![Fig. 16. Python Data Structures](https://www.dropbox.com/s/c6uk6ptsnb7x83i/ds.jpg?raw=1)

### Tuples

**IMMUTABLE** collection of elements

Defined using parentheses and separating elements with commas

Not all elements in a tuple have to be of the same type

In [31]:
tuple1 = (1,2,3)
tuple1

(1, 2, 3)

In [32]:
tuple2 = (1,'g',4.4,True)
tuple2

(1, 'g', 4.4, True)

### Lists

**MUTABLE** collection of elements

Defined using square brackets and separating elements with commas

Not all elements in a list have to be of the same type

In [33]:
list1 = [1,2,3]
list1

[1, 2, 3]

In [34]:
list2 = [1,'g',4.4,True]
list2

[1, 'g', 4.4, True]

### The `len()` function

In [35]:
len(tuple1)

3

In [36]:
len(list2)

4

### Accessing an element in a tuple/list

We can access any position in a tuple/list by using square brackets **after** the tuple/list

**Indexes in Python begin at 0!**

In [37]:
print(list1)
# Access the first element of list1
list1[0]

[1, 2, 3]


1

In [38]:
print(tuple2)
# Access the second element of tuple2
tuple2[1]

(1, 'g', 4.4, True)


'g'
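
Because indexes begin at 0, the last valid index is always `len()` minus one. A small sketch, redefining `list1` so the block is self-contained:

```python
list1 = [1, 2, 3]

last_index = len(list1) - 1       # 2, not 3
last_element = list1[last_index]

try:
    list1[len(list1)]             # index 3 does not exist
except IndexError as error:
    message = str(error)

print(last_element)  # 3
print(message)       # list index out of range
```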

Why do we need two very similar structures such as tuples and lists? 
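
One way to explore this question is to try changing each structure. The sketch below redefines `list1` and `tuple1` so that it runs on its own: the list accepts changes, while the tuple refuses them, which makes tuples a safer choice for data that must stay fixed.

```python
list1 = [1, 2, 3]
list1[0] = 100    # replace an element
list1.append(4)   # add an element at the end

tuple1 = (1, 2, 3)
try:
    tuple1[0] = 100   # tuples cannot be modified
except TypeError as error:
    message = str(error)

print(list1)    # [100, 2, 3, 4]
print(tuple1)   # (1, 2, 3)
print(message)
```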

# Coursework Clarification

# LAB: Getting Familiar with Python & Jupyter Notebook

In case you weren't here last week or you still have issues with Python and/or Jupyter Notebook, we will answer your queries

Moreover, you can test whether all extensions are working, such as RISE, Spellcheck and others

Finally, if you want to practice, I will enable some courses in DataCamp