# Intro to Data Science



---
<img src="https://calnerds.berkeley.edu/css/images/logo.jpg"  /> <!--style="width: 500px; height: 275px;"-->




### Table of Contents

1 - [Basics](#section1)<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1.1 - [Python for simple arithmetic](#subsection1)<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1.2 - [Variables](#subsection2)<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1.3 - [Python for textual data ](#subsection3)<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1.4 - [Lists](#subsection4)<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1.5 - [Arrays](#subsection5)<br>











---
## Basics  <a id='section1'></a>

* Data types (objects): Data can be grouped into two main types, __numerical__ and __textual__. In this section you will learn about the two kinds of numerical data, namely integers and floats.



### 1.1 Python can be used to compute simple __arithmetic__  <a id='subsection1'></a>


In [5]:
# EXAMPLE

2 + 3

5

In [6]:
2.0 + 3

5.0

In [7]:
4 / 2


2.0

In [None]:
5 * 3


In [None]:
5 ** 3

In [None]:
(1 + 2) * 3
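As a small sketch of the difference between the two numerical types, the built-in `type` function reports whether a result is an integer or a float. The variable names below are chosen only for illustration.

```python
# Compare the type of each result: int or float.
whole = 2 + 3        # int + int gives an int
mixed = 2.0 + 3      # float + int gives a float
quotient = 4 / 2     # division with / always gives a float
floored = 4 // 2     # floor division of two ints gives an int

print(type(whole), whole)
print(type(mixed), mixed)
print(type(quotient), quotient)
print(type(floored), floored)
```

This is why `4 / 2` displays `2.0` rather than `2`.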

### 1.2 Variables <a id='subsection2'></a>

Think of a variable as a container or storage box that lets us save information (objects) we plan to use later. Defining a variable involves two components: the name it will be called by and the information/object stored under that name. A variable holds one object at a time; assigning to the same name again replaces what was stored.

* We assign an object using **=** 



In [None]:
# EXAMPLE

var = "Variable"
var

In [None]:
# EXERCISE

hello  = "..."
hello
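The sketch below shows a variable being stored, used in a calculation, and then assigned again. The names `box` and `doubled` are hypothetical and used only for illustration.

```python
# A variable is a name that refers to a stored object.
box = 10             # box now holds the integer 10
doubled = box * 2    # use the stored value in a calculation
box = "ten"          # assigning again replaces what box holds

print(doubled)
print(box)
```

Note that `doubled` keeps the value computed earlier even after `box` is given a new value.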

### 1.3 Python can be used on textual data <a id='subsection3'></a>

* Textual data are called strings 
* Strings are defined by the quotation marks (either "double"  or 'single').





In [None]:
# EXAMPLE

"Kseniya Usovich"

In [None]:
# EXAMPLE

'Kseniya Usovich'

Be careful: some text already contains single (more common) or double (less common) quotation marks inside it. Like so:

In [None]:
# EXAMPLE

'Joe's'


If we use single quotation marks on the outside, Python raises a syntax error, as the cell above does when it is run. To fix this, use double quotation marks on the outside, or put a backslash before the inner quotation mark so that Python reads it as part of the string rather than as the end of it.

In [None]:
# EXAMPLE

"Joe's"

In [None]:
# EXAMPLE

'Joe\'s'
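A quick check, as a minimal sketch, that the two fixes produce the same string. The variable names are illustrative only.

```python
# Both spellings create the same five-character string.
with_double = "Joe's"      # double quotes on the outside
with_escape = 'Joe\'s'     # backslash before the inner quote

print(with_double == with_escape)
print(len(with_double))
```

The backslash itself is not stored in the string, which is why both versions have the same length.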

Everything inside the quotation marks is read as text. Compare the outputs in the cell below.

In [15]:
# EXAMPLE

print("2+3")
print(2+3)

2+3
5
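The same distinction shows up with the `+` operator. The sketch below uses illustrative variable names and the built-in `int` function to convert text to a number.

```python
# "+" joins strings but adds numbers.
text_sum = "2" + "3"                   # two strings are glued together
number_sum = 2 + 3                     # two integers are added
converted_sum = int("2") + int("3")    # convert the text to integers first

print(text_sum)
print(number_sum)
print(converted_sum)
```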


---
### 1.4 Lists <a id='subsection4'></a>

A list is a data structure that allows you to save multiple objects.


* NOTE: DON'T EVER CALL YOUR LIST "list". This is the name of a built-in function in Python, and reusing it as a variable name hides the built-in. BUT, you can use this function to create new lists!

In [None]:
# here's how you can create an empty list
# you'll learn where to use it later in the workshop

empty_lst = []
empty_lst

Unlike strings, lists can contain objects of different data types.

In [None]:
# EXAMPLE

new_lst = [2, "name", 3, "berkeley"]
new_lst

Make sure you don't use the names of built-in functions as names for your variables. It is easy to tell (in Jupyter Notebooks) whether something is a built-in function: when you type it, it appears in green in your code cell.

In [None]:
list

Let's see how we can use this function on a string.

In [None]:
# EXAMPLE

name = "berkeley"  # example string (any string works here)
new_name = list(name)
new_name
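As a hedged sketch of both points, the block below first uses `list` on a string, then shows what goes wrong when the name `list` is reused for a variable. The string `"data"` and the helper `shadowing_demo` are hypothetical; the reuse happens inside a function so the built-in stays intact everywhere else.

```python
# list() splits a string into a list of single characters.
word = "data"              # hypothetical example string
letters = list(word)
print(letters)

def shadowing_demo():
    """Reuse the name 'list' locally and try to call it as a function."""
    list = [1, 2, 3]       # hides the built-in inside this function only
    try:
        list("abc")        # the name now refers to a list, not a function
    except TypeError as err:
        return str(err)
    return "no error"

message = shadowing_demo()
print(message)
```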

### 1.4.1 List Indexing and Slicing `[ : ]`

To access an element of a list, use square brackets with the index of the object you are interested in. Like so:

In [None]:
# remember that Python, like many programming languages, starts counting from 0.

new_lst[1]

Notice also that each component of the list, separated by commas, is a single object. So a list inside of a list is a single object, and so is a string. If you want to reach inside an element that is itself a list or a string, use square brackets twice. Here the element at index 3 is the string "berkeley", so the second pair of brackets picks out its first character:

In [None]:
# EXAMPLE

new_lst[3][0]

You can also replace an object in the list by assigning to its index.

In [None]:
# EXAMPLE

new_lst[0] = 6
new_lst
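Slicing with `[start:stop]` returns a new list that includes the element at `start` and stops just before `stop`. A minimal sketch, using a separate list called `items` (a hypothetical name) so the cells above are unaffected:

```python
# Indexing picks one element; slicing picks a range of elements.
items = [2, "name", 3, "berkeley"]

first_two = items[0:2]       # elements at index 0 and 1
from_second = items[1:]      # everything from index 1 to the end
last_item = items[-1]        # negative indices count from the end
first_letter = items[3][0]   # first character of the string at index 3

print(first_two)
print(from_second)
print(last_item)
print(first_letter)
```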

### 1.4.2 Adding Elements

To add elements to a list, we can use a few methods. The ones we will show you today are the built-in list methods **insert** and **append**.

In [None]:
# EXAMPLE

trees = ["Sequoia", "Palm Tree", "Joshua Tree"]

# to add an element to the end of the list, we can use "append"

trees.append("Pine Tree")
trees

We use **insert** when we want to put something at a certain position. It takes two arguments: the index at which to insert, and then the element itself.

In [None]:
# EXAMPLE

trees.insert(2, "Redwood")
trees
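A short sketch of how the two methods differ, on a separate hypothetical list called `fruits`. It also shows that an index past the end of the list places the new element last.

```python
# append adds to the end; insert adds at a chosen index.
fruits = ["apple", "banana"]

fruits.append("cherry")        # ['apple', 'banana', 'cherry']
fruits.insert(0, "avocado")    # new element goes to the front
fruits.insert(100, "date")     # index past the end: element goes last

print(fruits)
```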

In the cell below, try adding one tree of your liking at position 1 and then another tree at the end of the list.

### 1.4.3 Deleting Elements

Python has three built-in ways to delete elements from a list: the **del** statement and the list methods **.pop(  )** and **.remove(  )**.

In [1]:
random_list = ['Zoom', 1, 7, 9, "Python", "Berkeley"]

#### The **del** statement works with indices and allows for the deletion of parts of the list through slicing.

In [None]:
# EXAMPLE 

del random_list[0:2]

random_list

The method **.pop(  )** works with indices. By default it removes the last element of the list, but you can pass a specific index instead.

In [None]:
# EXAMPLE 

random_list.pop()

In [None]:
# EXAMPLE 

random_list.pop(0)

Notice that **.pop(  )** also returns the element it has removed from the list.

In [None]:
# EXAMPLE 

state = ["California", "is", "on the", "West Coast"]
last = state.pop()

print(state)
print(last)

Another method you can use for lists is **.remove(  )**. It removes the first element that matches the given value exactly (case matters too).

In [None]:
# EXAMPLE 

random_list.remove("Python")

In [None]:
random_list
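The three approaches side by side, as a minimal sketch on a fresh copy of the list (the name `items` is hypothetical). It also shows that **.remove** raises a `ValueError` when the value is not in the list, for example because the case does not match.

```python
# Start from a fresh list so each step is easy to follow.
items = ["Zoom", 1, 7, 9, "Python", "Berkeley"]

del items[0:2]               # delete by slice: [7, 9, 'Python', 'Berkeley']
popped = items.pop()         # remove and return the last element
items.remove("Python")       # remove by exact value

try:
    items.remove("python")   # lowercase "python" is not in the list
    missing = False
except ValueError:
    missing = True

print(items)
print(popped)
print(missing)
```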

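Now try removing any three elements from this list in three different ways. If you ran the cells above, re-run the cell that defines **random_list** first so that there are enough elements to remove.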

### 1.4.4 Length of a list

The built-in function **len(   )** can also be used with lists; it returns the number of elements.

In [None]:
# EXAMPLE 
# Before running this cell, think what the output will be.

countries = ["USA", "Belarus", "Mexico", "Poland"]
len(countries)
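A few more cases, sketched with illustrative values, show what `len` counts: characters in a string, elements in a list, and a nested list as one element.

```python
# len counts top-level elements only.
string_length = len("Berkeley")         # characters in the string
empty_length = len([])                  # an empty list has length 0
nested_length = len([1, [2, 3], "a"])   # the inner list counts as one element

print(string_length)
print(empty_length)
print(nested_length)
```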

### 1.4.5 Operations on a list

In [40]:
my_numbers = [1, 2, 3, 4, 5]

#### Add 2

In [11]:
my_numbers + 2

TypeError: can only concatenate list (not "int") to list

Lists do not support "+" as addition; instead, the "+" operator concatenates. In order to concatenate, or glue together, two objects, they must be the same data type or data structure, for example two lists.


In [39]:
list_one = [1, 2, 3]
list_two = [4, 5, 6]
list_three = list_one + list_two
list_three

[1, 2, 3, 4, 5, 6]

#### Multiply by 2

In [13]:
my_numbers * 2

[1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
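If the goal is to change each number in a list, one option is a list comprehension, which builds a new list element by element. This technique is not covered elsewhere in this notebook, and the names `plus_two` and `times_two` are hypothetical.

```python
# Apply an operation to every element of a plain Python list.
my_numbers = [1, 2, 3, 4, 5]

plus_two = [n + 2 for n in my_numbers]    # add 2 to each element
times_two = [n * 2 for n in my_numbers]   # double each element

print(plus_two)
print(times_two)
```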

---
### 1.5 Arrays <a id='subsection5'></a>

Arrays are another data structure that is similar to lists in a way, but __they only allow for one type of information to be saved in them__. Where a list can hold multiple data types, every element of an array has the same type (for example, all numbers or all strings). Arrays also support operations that act on each element at once.
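A hedged sketch of the one-type rule, assuming the NumPy library is installed and imported as `np`. When NumPy is given mixed values it converts them to a single common type, here text. The variable names are illustrative.

```python
import numpy as np

# An array of integers keeps an integer type.
numbers = np.array([0, 1, 2])

# Mixed input is converted to one common type (strings in this case).
mixed = np.array([1, "two", 3.0])

print(numbers.dtype.kind)   # 'i' means integer
print(mixed.dtype.kind)     # 'U' means text (Unicode string)
print(mixed)
```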

In [26]:
# EXAMPLE

import numpy as np  # NumPy provides the array type used in this section

# here we create an array with values from 0 to 6

my_array = np.array([0, 1, 2, 3, 4, 5, 6])


#### Add 3

* adds 3 to each element in the array 

In [31]:
my_array + 3

array([3, 4, 5, 6, 7, 8, 9])

#### Multiply by 3

* multiplies each element in the array by 3

In [42]:
# EXAMPLE 

my_array * 3


array([ 0,  3,  6,  9, 12, 15, 18])

#### Operations can also work with more than one array. 


In [37]:
array_one = np.array([1, 2, 3])
array_two =  np.array([4, 5, 6])

In [38]:
array_three  = array_one + array_two 
array_three

array([5, 7, 9])
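For contrast, here is a minimal sketch of the same `+` operator on lists and on arrays, assuming NumPy is installed and imported as `np`. The names `list_sum` and `array_sum` are hypothetical.

```python
import numpy as np

# With lists, + concatenates.
list_sum = [1, 2, 3] + [4, 5, 6]

# With arrays of the same length, + adds element by element.
array_sum = np.array([1, 2, 3]) + np.array([4, 5, 6])

print(list_sum)
print(array_sum)
```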

#### Change Values

In [None]:
# EXAMPLE 

my_array[2] = 5
my_array

Arrays are most commonly used with tables (Data Frames), which we will work with later.

---
Notebook developed by: Kseniya Usovich & Karla Palos

Cal NERDS GitHub: https://github.com/Cal-NERDS
