# Introduction to Data Science 


# Lab 1: Introduction to Python


**British University in Egypt**<br>
**Instructor:** Nahla Barakat



---


## Table of Contents 
<ol start="0">
<li> Learning Goals </li>
<li> Getting Started</li>
<li> Lists </li>
</ol>

## Part 0:  Learning Goals 
This lab is an introductory tutorial in Python programming.  By the end of this lab, you will feel more comfortable:

- Writing short Python code using functions, loops, arrays, dictionaries, strings, and if statements.

- Manipulating Python lists and recognizing the listy properties of other Python containers.

- Learning and reading Python documentation.  

## Part 1: Getting Started

### Importing modules
All notebooks should begin with code that imports *modules*, collections of commonly used Python functions.  Below we import the NumPy module, a fast numerical programming library for scientific computing.  Future labs will require additional modules, which we'll import with the same `import MODULE_NAME as MODULE_NICKNAME` syntax.

In [2]:
import numpy as np  # imports a fast numerical programming library

Now that NumPy has been imported, we can access some useful functions.  For example, we can use `mean` to calculate the mean of a set of numbers.

In [3]:
np.mean([1.2, 2, 3.3])

2.1666666666666665

The call above calculates the mean of 1.2, 2, and 3.3.

The code above is not particularly efficient, and efficiency will be important for you when dealing with large data sets. We will see more efficient options later in the course.

### Calculations and variables

At the most basic level we can use Python as a simple calculator.

In [4]:
1 + 2

3

Notice integer division (//) and floating-point error below!

In [5]:
1/2, 1//2, 1.0/2.0, 3*3.2

(0.5, 0, 0.5, 9.600000000000001)
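Both effects can be checked directly. The block below is a minimal sketch that is not part of the original lab: the variable names are illustrative, and `math.isclose` is the standard-library helper for comparing floats within a tolerance.

```python
import math

# Floor division rounds toward negative infinity, not toward zero.
floor_pos = 7 // 2     # 3
floor_neg = -7 // 2    # -4

# 3 * 3.2 cannot be stored exactly, so an exact comparison fails.
product = 3 * 3.2
exact_match = (product == 9.6)             # False
close_match = math.isclose(product, 9.6)   # True

print(floor_pos, floor_neg, exact_match, close_match)
```

When comparing floating-point results, a tolerance-based check like this is usually safer than `==`.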

The last line in a cell is returned as the output value, as above.  For cells with multiple lines of results, we can display results using ``print``, as can be seen below.

In [6]:
print(1 + 3.0, "\n", 9, 7)
5/3

4.0 
 9 7


1.6666666666666667

We can store integer or floating point values as variables.  The other basic Python data types -- booleans, strings, lists -- can also be stored as variables. 

In [7]:
a = 1
b = 2.0

We can also store a list:

In [8]:
a = [1, 2, 3]

Think of a variable as a label for a value, not a box in which you put the value.

![](images/sticksnotboxes.png)

(image taken from Fluent Python by Luciano Ramalho)

In [9]:
b = a
b

[1, 2, 3]

This DOES NOT create a new copy of `a`. It merely attaches a second label, `b`, to the same list in memory, as the following code shows:

In [10]:
print("a", a)
print("b", b)
a[1] = 7
print("a after change", a)
print("b after change", b)

a [1, 2, 3]
b [1, 2, 3]
a after change [1, 7, 3]
b after change [1, 7, 3]
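If an independent list is wanted instead of a second label, one option is to copy it. The sketch below uses illustrative names (`original`, `alias`, `shallow_copy`) that do not appear in the lab, and `list.copy()`, which makes a shallow copy.

```python
original = [1, 2, 3]
alias = original                # second label on the same list
shallow_copy = original.copy()  # new list holding the same items

original[1] = 7

print(alias)          # follows the change, because it is the same object
print(shallow_copy)   # unchanged
print(alias is original, shallow_copy is original)
```

The `is` operator tests whether two labels refer to the same object, which is a quick way to tell an alias from a copy.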


Multiple comma-separated expressions on one line are returned as a *tuple*, an immutable sequence of Python objects.

In [11]:
a = 1
b = 2.0
a + a, a - b, b * b, 10*a

(2, -1.0, 4.0, 10)

We can obtain the type of a variable, and use boolean comparisons to test these types. 

In [12]:
type(a) == float

False

In [13]:
type(a) == int

True
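Type checks like these are often written with the built-in `isinstance` instead. The following is a small sketch with illustrative variable names, not part of the original lab.

```python
count = 1      # an int, like `a` above
ratio = 2.0    # a float, like `b` above

is_int = isinstance(count, int)               # True
is_number = isinstance(ratio, (int, float))   # True: a tuple of types is allowed
bool_is_int = isinstance(True, int)           # True: bool is a subclass of int
bool_exact = type(True) == int                # False: the exact type is bool

print(is_int, is_number, bool_is_int, bool_exact)
```

The last two lines show the difference: `isinstance` accepts subclasses, while comparing `type(...)` with `==` matches only the exact type.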

> **EXERCISE**:  Create a tuple called `tup` with the following five objects:

> - The first element is an integer of your choice
> - The second element is a float of your choice  
> - The third element is the sum of the first two elements
> - The fourth element is the difference of the first two elements
> - The fifth element is the first element divided by the second element

> Display the output of `tup`.  What is the type of the variable `tup`? What happens if you try to change an item in the tuple? 

In [14]:
# your code here
a = 5
b = 2.5
tup= a,b,a+b,a-b,a/b
tup

(5, 2.5, 7.5, 2.5, 2.0)
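The last question in the exercise can be explored with a sketch like the one below. It reuses the values from the solution above; `new_tup` and `error_message` are illustrative names.

```python
tup = (5, 2.5, 7.5, 2.5, 2.0)

try:
    tup[0] = 10          # tuples do not support item assignment
except TypeError as err:
    error_message = str(err)
    print("TypeError:", error_message)

# To "change" a tuple, build a new one instead.
new_tup = (10,) + tup[1:]
print(type(tup), new_tup)
```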

[Take this form when you are done](https://forms.office.com/Pages/ResponsePage.aspx?id=Bm7bI8QFnUixNsupSo5vNnDd1nrjesdCq04vgwaV475UOFZTVTFLTUQ3WEdZSDVHVENJQVZWUUNQWS4u)

## Part 2: Lists

Much of Python is based on the notion of a list.  In Python, a list is a sequence of items separated by commas, all within square brackets.  The items can be integers, floats, or any other type.  Unlike in C arrays, items in a Python list can be of different types, so Python lists are more versatile than traditional arrays in C or in other languages. 

Let's start out by creating a few lists.  

In [15]:
empty_list = []
float_list = [1., 3., 5., 4., 2.]
int_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
mixed_list = [1, 2., 3, 4., 5]
print(empty_list)
print(int_list)
print(mixed_list, float_list)

[]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
[1, 2.0, 3, 4.0, 5] [1.0, 3.0, 5.0, 4.0, 2.0]


Lists in Python are zero-indexed, as in C.  The first entry of the list has index 0, the second has index 1, and so on.

In [16]:
print(int_list[0])
print(float_list[1])

1
3.0


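What happens if we try to use an index that doesn't exist for that list, such as `float_list[10]`?  Python will complain with an `IndexError`!  The cell below uses a valid index, so it runs without error.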

In [17]:
print(float_list[1])

3.0
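To see the complaint without stopping the notebook, the out-of-range access can be wrapped in `try`/`except`. This is a minimal sketch, with `float_list` redefined so the block is self-contained and `index_error` an illustrative name.

```python
float_list = [1., 3., 5., 4., 2.]

try:
    print(float_list[10])      # valid indices are 0 to 4
except IndexError as err:
    index_error = str(err)
    print("IndexError:", index_error)
```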


A list has a length at any given point in the execution of the code, which we can find using the `len` function.

In [18]:
print(float_list)
len(float_list)

[1.0, 3.0, 5.0, 4.0, 2.0]


5

### Indexing on lists

Since Python is zero-indexed, the last element of `float_list` is at index `len(float_list) - 1`:

In [19]:
float_list[len(float_list)-1]

2.0

It is more idiomatic in Python to use -1 for the last element, -2 for the second-to-last, and so on.

In [20]:
float_list[-1]

2.0

We can use the ``:`` operator to access a subset of the list.  This is called *slicing.* 

In [21]:
print(float_list[1:5])
print(float_list[0:2])

[3.0, 5.0, 4.0, 2.0]
[1.0, 3.0]


Below is a summary of list slicing operations:

<img src="file://localhost/Ashraf/Introduction to Data Science/Formal Lectures/Ain Shams Labs/Lab1/ops3_v2.png" alt="Drawing" style="width: 600px;"/>

You can slice "backwards" as well:

In [22]:
float_list[:-2] # up to but not including the second-to-last element

[1.0, 3.0, 5.0]

In [23]:
float_list[:4] # up to but not including 5th element

[1.0, 3.0, 5.0, 4.0]

You can also slice with a stride:

In [24]:
float_list[:4:2] # same as above, but taking every second element

[1.0, 5.0]
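A few more slicing behaviors are worth trying. The sketch below is not from the original lab; it redefines `float_list` so it runs on its own, and the other names are illustrative.

```python
float_list = [1., 3., 5., 4., 2.]

reversed_list = float_list[::-1]   # negative stride walks backwards
last_two = float_list[-2:]         # from the second-to-last element to the end
clipped = float_list[2:100]        # out-of-range slice bounds are clipped, no error

# A slice is a new list, so changing it leaves the original alone.
first_three = float_list[:3]
first_three[0] = 99.0

print(reversed_list, last_two, clipped)
print(first_three, float_list)
```

Unlike indexing a single element, slicing past the end of a list does not raise an error.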

We can iterate through a list using a loop.  Here's a for loop.

In [25]:
for ele in float_list:
    print(ele)

1.0
3.0
5.0
4.0
2.0


Or, if we like, we can iterate over the indices of the list with a for loop and `range`. This is not idiomatic and is not recommended, but it accomplishes the same thing as above.

In [26]:
for i in range(len(float_list)):
    print(float_list[i])

1.0
3.0
5.0
4.0
2.0


What if you wanted the index as well?

Python has other useful functions such as `enumerate`, which pairs each value with its index, producing tuples of the form `(index, value)`. 

In [27]:
for i, ele in enumerate(float_list):
    print(i,ele)

0 1.0
1 3.0
2 5.0
3 4.0
4 2.0


In [28]:
list(enumerate(float_list))

[(0, 1.0), (1, 3.0), (2, 5.0), (3, 4.0), (4, 2.0)]

This is an example of an *iterator*, something that can be used to set up an iteration. When you call `enumerate`, a list of tuples is not created. Rather, an object is created which, when iterated over (or passed to the `list` function as an argument), behaves like a loop that outputs one tuple at a time.
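The one-at-a-time behavior can be observed with the built-in `next`. This is a small sketch with illustrative names; it also shows the optional `start` argument of `enumerate`.

```python
float_list = [1., 3., 5., 4., 2.]

pairs = enumerate(float_list)   # no tuples have been produced yet
first = next(pairs)             # (0, 1.0)
second = next(pairs)            # (1, 3.0)
remaining = list(pairs)         # consumes whatever is left
exhausted = list(pairs)         # [] because the iterator is used up

# start= changes the first index
numbered = list(enumerate(float_list[:2], start=1))

print(first, second, remaining, exhausted, numbered)
```

Because an iterator can only be consumed once, call `enumerate` again if a second pass over the pairs is needed.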

### Appending and deleting

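We can also add items to the end of a list using the `+` operator or with `append`.  The `+` operator returns a new list and leaves the original unchanged, while `append` modifies the list in place.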

In [29]:
float_list + [.333]

[1.0, 3.0, 5.0, 4.0, 2.0, 0.333]

In [30]:
float_list.append(.444)

In [31]:
print(float_list)
len(float_list)

[1.0, 3.0, 5.0, 4.0, 2.0, 0.444]


6

Go and run the cell with `float_list.append` a second time.  Then run the next cell.  What happens?  

To remove an item from the list, use `del`.

In [32]:
del float_list[2]  # del is a statement, so no parentheses are needed
print(float_list)

[1.0, 3.0, 4.0, 2.0, 0.444]
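The operations in this section can be compared side by side. The sketch below uses a separate illustrative list, `values`, so it does not disturb `float_list`; `pop` and `remove` are standard list methods not otherwise covered in this lab.

```python
values = [1.0, 3.0, 5.0]

longer = values + [0.333]   # builds a new list; values is unchanged
values.append(0.444)        # modifies values in place and returns None

removed = values.pop()      # removes and returns the last item
values.remove(3.0)          # removes the first item equal to 3.0
del values[0]               # removes the item at index 0

print(longer, removed, values)
```

Use `del` or `pop` when the position is known, and `remove` when the value is known.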


### List Comprehensions

Lists can be constructed in a compact way using a *list comprehension*.  Here's a simple example.

In [33]:
squaredlist = [i*i for i in int_list]
squaredlist

[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

And here's a more complicated one, requiring a conditional.

In [34]:
comp_list1 = [2*i for i in squaredlist if i % 2 == 0]
print(comp_list1)

[8, 32, 72, 128, 200]


This is equivalent to creating `comp_list1` using a loop with a conditional, as below:

In [35]:
comp_list2 = []
for i in squaredlist:
    if i % 2 == 0:
        comp_list2.append(2*i)
        
comp_list2

[8, 32, 72, 128, 200]

The list comprehension syntax

```
[expression for item in list if conditional]
```

is equivalent to the following loop, where `result` is the list being built:

```
result = []
for item in list:
    if conditional:
        result.append(expression)
```

>**EXERCISE**:  Build a list that contains every prime number between 1 and 100, in two different ways:
1.  Using for loops and conditional if statements.
2.  *(Stretch Goal)* Using a list comprehension.  You should be able to do this in one line of code, and it may be helpful to look up the function `all` in the documentation.

In [36]:
# your code here
PrimeList=[]
for i in range(2,100):
    isPrime=True
    
    for x in range(2,i):
        if i%x==0:
            isPrime=False
            break
            
    if isPrime:
        PrimeList.append(i)

print(PrimeList)
        

[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97]


In [37]:
# your code here
PrimeList = [i for i in range(2, 100) if all(i % x != 0 for x in range(2, i))]
print(PrimeList)

[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97]
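Both solutions above test every candidate divisor below `i`. One common refinement is to stop at the square root, since any larger factor must pair with a smaller one. The sketch below is one possible variant, not the lab's official solution: `is_prime` is a hypothetical helper, and `math.isqrt` assumes Python 3.8 or newer.

```python
import math

def is_prime(n):
    """Return True if n is prime, using trial division up to sqrt(n)."""
    if n < 2:
        return False
    # all() over an empty range is True, which handles n = 2 and n = 3.
    return all(n % d != 0 for d in range(2, math.isqrt(n) + 1))

primes = [n for n in range(2, 101) if is_prime(n)]
print(primes)
```

Moving the test into a named function keeps the comprehension short and makes the check reusable.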
