# DSC200 - Lecture 2
## Python Basics - part 1


## The Python language

Python is one of the most popular programming languages in the world and is widely used in many domains:
 - data science (obviously!)
 - machine learning
 - web development
 - analytics
 - algorithms
 


 It is designed to have a simple syntax: no semicolons, no curly braces to denote scopes, etc. It also has:
 - dynamic typing: no need to specify the data type of a variable
 - a wide variety of existing packages: no need to reinvent the wheel

 Personally, Python is my go-to language for most of my projects. It is easy to learn and use, can be just as fast as other languages when the heavy lifting is done by optimized libraries, and has a large and active ecosystem.

In this course we will primarily be using Jupyter Notebooks to write and run Python code.

These versatile notebooks allow you to write and run code in cells, and also include text and images to document your code.

In fact, this lecture is written in a Jupyter Notebook! Assignments will also be submitted in this format.

Let's take a look at the DataHub interface and how to run the right instance with all the necessary packages.

Notebooks are fantastic for learning and teaching, as they allow you to write code and see the output immediately.

For more intensive analysis code, you may want to use a Python script, which can be run from the command line or in an IDE like PyCharm. We will cover this in a later lecture.

For now, I encourage you to follow along in your own Jupyter Notebook, and try out the code as we go along.

## Hello world

The first thing you learn in any programming language is how to print "Hello, World!" to the screen. Here's how you do it in Python:

In [1]:
print("Hello, World!")

Hello, World!


Note that in Python 2 you could write `print "Hello, World!"`, but in Python 3 `print` is a function, so you must write `print("Hello, World!")`. We will be using Python 3 in this course.

Also, the argument passed to `print()` doesn't need to be a string; it can be a value of any data type.
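
As a quick sketch of that point, `print()` accepts values of different types, and several values at once. The values and the name `course_number` below are arbitrary examples:

```python
# print() converts each argument to text before displaying it
print(42)                    # an integer
print(3.14)                  # a float
print(True)                  # a boolean

course_number = 200
print("DSC", course_number)  # several arguments are separated by a space
```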

## Comments

Comments are an important way to make your code more readable and clear. Python ignores everything from a `#` to the end of the line, so a comment can sit on its own line or follow code on the same line.

In [2]:
# This is a comment
print("Hello, World!") # This is also a comment


Hello, World!


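For longer comments and documentation, you can use triple-quoted strings. Strictly speaking these are string literals rather than comments, but a string that is not assigned to anything has no effect when the code runs: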

In [3]:
"""
This is a longer comment or documentation string.
It can span multiple lines. You can use it to describe the purpose of a function or a block of code.
"""
print("Hello, World!")

Hello, World!
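
The most common use of a triple-quoted string is as a docstring, placed as the first statement of a function. Here is a minimal sketch; the function name `greet` is made up for illustration:

```python
def greet(name):
    """Return a greeting for the given name."""
    return "Hello, " + name + "!"

print(greet("World"))   # Hello, World!
print(greet.__doc__)    # Return a greeting for the given name.
```

Unlike a `#` comment, the docstring is stored on the function and can be read back, which is what tools such as `help()` rely on.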


## Variables

Variables are used to store data that can be referenced and manipulated in a program. In Python, you don't need to declare the type of a variable; it is determined automatically from the value you assign.

You can assign a value to a variable using the `=` operator:


In [4]:
x = 5
y = "Hello, World!"

### Quick quiz

What is the value of `y` after the following code is run?

*sli.do ID: 4190516*

In [5]:
x = 5
y = 10
y = x


## Variables can have a variety of types

Variables can hold values of different types. Python's built-in data types include:

- `int` for integers
- `float` for floating-point numbers
- `str` for strings
- `bool` for boolean values (True or False)



You can check the type of a variable using the `type()` function:

In [6]:
x = 5
y = 5.0
z = "Hello, World!"

print(type(x))
print(type(y))
print(type(z))

<class 'int'>
<class 'float'>
<class 'str'>
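
The cell above does not cover `bool`, and it may also help to see that a variable's type follows whatever value it currently holds. A brief sketch with arbitrary names:

```python
flag = True
print(type(flag))     # <class 'bool'>

value = 5
print(type(value))    # <class 'int'>
value = "five"        # rebinding the same name to a string
print(type(value))    # <class 'str'>
```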


## Strings

Strings are used to store text data. You can create a string by enclosing text in single or double quotes as we saw earlier. 

They have a number of useful methods and operations that can be used to manipulate and format them:

In [7]:
s = "hello, world!"
print(len(s)) # 13
print(s.upper()) # HELLO, WORLD!
print(s.capitalize()) # Hello, world!
print(s.replace("world", "DSC200")) # hello, DSC200!
print(s.split(",")) # ['hello', ' world!']

13
HELLO, WORLD!
Hello, world!
hello, DSC200!
['hello', ' world!']


You can also concatenate strings using the `+` operator:

In [8]:
s1 = "hello"
s2 = "world"
print(s1 + " " + s2) # hello world

hello world
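
One thing to watch for: `+` only joins strings with other strings. A small sketch, using made-up variable names, of converting a number with `str()` first:

```python
course = "DSC"
number = 200

# course + number would raise a TypeError, because number is an int
label = course + str(number)
print(label)   # DSC200
```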


## Indexing and slicing

Strings can be indexed and sliced using square brackets `[]`. Indexing starts at **0** in Python!

In [9]:
s = "hello, world!"
# indexing specific characters
print(s[0]) # h
print(s[1]) # e


h
e


In [10]:
# slicing
print(s[0:5]) # hello

hello


In [11]:
# implicit start and end
print(s[7:]) # world!


world!


In [12]:
# negative indexing
print(s[-1]) # !
print(s[-6:]) # world!


!
world!


In [13]:
# step
print(s[::2]) # hlo ol!
print(s[::-1]) # !dlrow ,olleh

hlo ol!
!dlrow ,olleh
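
All of the slices above are special cases of the general form `s[start:stop:step]`, where `start` is included, `stop` is excluded, and omitted parts take default values. A short sketch on the same string:

```python
s = "hello, world!"

print(s[7:12])              # world
print(s[0:5:2])             # hlo
print(s[-6:-1])             # world
print(s[:5] + s[5:] == s)   # True
```

The last line shows why excluding `stop` is convenient: splitting a string at any index gives two pieces that join back into the original.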


Question 1: Suppose we want to slice the string stored in `greetings` so that `sliced` contains only “you”. Fill in the blank.

In [14]:
greetings = "Hello, how are you?"
sliced = greetings[:]
print(sliced)

Hello, how are you?


Question 2: Given `greetings` defined above, how would you find the index where the word “are” starts?

## Other containers

Python offers several built-in container types for storing multiple items in a single variable:

- **sequence types**: store multiple items in a fixed order
  - list (`[]`), tuple (`()`)

- **set types**: store multiple distinct items, with no duplicates
    - set (`{1, 2, 3}`; an empty set is written `set()`, because `{}` creates an empty dict)

- **map types**: store multiple key-value pairs, where each key is mapped to a value
    - dict (`{<key> : <value>, ...}`)
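
As a rough illustration of the four container types listed above (the names and values are arbitrary):

```python
my_list = [1, 2, 2, 3]          # list: ordered, mutable
my_tuple = (1, 2, 2, 3)         # tuple: ordered, immutable
my_set = {1, 2, 2, 3}           # set: duplicates are dropped
my_dict = {"a": 1, "b": 2}      # dict: maps keys to values

print(len(my_list))    # 4
print(len(my_set))     # 3
print(my_dict["b"])    # 2
print(type({}))        # <class 'dict'>, so use set() for an empty set
```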


## Lists

Lists can be created in multiple ways:

In [15]:
# Declaring an empty list
my_list = []
# or
my_list = list()

In [16]:
# Declaring a list with values
my_list = [1, 2, 3, 4, 5]

In [17]:
# Using an existing list
new_list = list(my_list)


In [18]:
# Using a list comprehension
new_list = [x for x in range(10)]
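
Note that `list(...)` builds a new list rather than a second name for the same one, and that a comprehension can transform items as it builds the list. A minimal sketch (variable names are illustrative):

```python
original = [1, 2, 3]
copied = list(original)   # a new, separate list with the same items

copied.append(4)
print(original)   # [1, 2, 3]
print(copied)     # [1, 2, 3, 4]

# A comprehension can also transform each item as it builds the list
squares = [x * x for x in range(5)]
print(squares)    # [0, 1, 4, 9, 16]
```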

### Quick quiz

Question 1: What is the output of the following code?


In [19]:
my_list = [1, "data", 1.4, [13, 15, 1]]
my_list[-2] 

1.4

Question 2: How do I extract the value 15 from the list `my_list`?

Question 2b: How do I extract the value 15 from the list `my_list` using negative indexing?

Question 3: What is the value of `short_list` after the following slice operation?

In [20]:
short_list = my_list[1:3]


Question 4: What is the value of `magic` after the following operation?


In [21]:
magic = my_list[2:3] + short_list

Question 5: What is the output of the following operation?


In [22]:
magic[0] = 0
magic

[0, 'data', 1.4]

## Tuples

Tuples are similar to lists, but they are immutable, meaning they cannot be changed after they are created.

Tuples can be created in multiple ways:

 - Declaring an empty tuple: `my_tuple = ()` or `my_tuple = tuple()`
 - Initializing a tuple with some values: `my_tuple = (1, 2, 3)`
 - Using an existing tuple: `my_tuple = tuple(another_tuple)`
 - Declaring a tuple with a range: `my_tuple = tuple(range(5))`
 - Using a generator expression: `my_tuple = tuple(x for x in range(5))` (parentheses alone, as in `(x for x in range(5))`, create a generator, not a tuple)
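
A caveat when building a tuple from a comprehension-style expression: parentheses alone produce a generator object, so the expression has to be passed to `tuple()`. A brief sketch with arbitrary names:

```python
gen = (x for x in range(5))             # a generator, not a tuple
print(type(gen))                        # <class 'generator'>

my_tuple = tuple(x for x in range(5))   # tuple() consumes the generator
print(my_tuple)                         # (0, 1, 2, 3, 4)

single = (1,)                           # a one-item tuple needs a trailing comma
print(type(single))                     # <class 'tuple'>
```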


### Quick quiz

Question 1: What is the output of the following code?


In [23]:
my_tuple = (1, "data", 1.4, [13, 15, 1])
my_tuple[-2]

1.4

Question 2: What is the output of the following operation?


In [24]:
(number, text, float_number, inner_list) = my_tuple
float_number


1.4

Question 3: What is the output of the following operation?

In [25]:
# my_tuple[1] = "dsc200"
# my_tuple
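
If the two lines above are uncommented, Python refuses the assignment. A minimal sketch that catches the error so the cell still runs (the `try`/`except` wrapper is an addition for illustration):

```python
my_tuple = (1, "data", 1.4, [13, 15, 1])

try:
    my_tuple[1] = "dsc200"
except TypeError as error:
    print(error)   # 'tuple' object does not support item assignment

# The list stored inside the tuple is itself still mutable
my_tuple[3].append(99)
print(my_tuple)    # (1, 'data', 1.4, [13, 15, 1, 99])
```

The tuple never changes which objects it holds, but a mutable object inside it can still be modified in place.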

## Mutability

Every value in Python is an object, and a variable is a name that refers to an object. For this discussion, two properties of an object matter:

- reference: indicates where (i.e. at which memory address) the object can be found
- value: the actual data stored in the object

`id()` returns an integer that identifies an object; in CPython, this is the object's memory address:

In [26]:
x = 5
print(id(x))


4311542192


Immutable objects cannot change their value once created. When you appear to change the value of a variable that holds an immutable object, you are actually creating a new object and rebinding the same variable name to it.

Mutable objects can change their value after creation. When you modify a mutable object, the same object is updated in place and its reference stays the same.

Let's see an example:

In [27]:
x = 5
print(id(x))
x = 6
print(id(x))

4311542192
4311542224


Question: So is an integer mutable or immutable?

How about a list?

In [28]:
x = [1, 2, 3]
print(id(x))
x.append(4)
print(id(x))

4375439104
4375439104


In Python, integers, floats, strings, and tuples are immutable, while lists, sets, and dictionaries are mutable.
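
One practical consequence of mutability is aliasing: assigning a list to a second name does not copy it, so both names see any change. A small sketch with arbitrary names:

```python
a = [1, 2, 3]
b = a                # b refers to the same list object as a
b.append(4)
print(a)             # [1, 2, 3, 4]
print(a is b)        # True

s = "hello"
t = s.upper()        # strings are immutable, so upper() returns a new string
print(s)             # hello
print(t)             # HELLO
```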