### MY470 Computer Programming
# Data Types in Python
### Week 2 Lecture

## Overview

* About Python
* Scalars: `int`, `float`, `bool`, `None`
    * Operators: arithmetic, boolean, comparison, assignment, membership
* Non-scalars: `list`, `tuple`, `str`, `set`, `dict`
    * Methods
    * Ordered vs. unordered non-scalars       
    * Mutable vs. immutable non-scalars


![Python on xkcd](figs/xkcd.png "Python on xkcd")

   Source: http://xkcd.com/353/

## Why Python?

![Python](figs/python.png "Python")

* Open-source – free and well-documented
* Simple and concise syntax
* Many useful libraries
* Cross-platform
* [Widely used in industry and science](https://youtu.be/cKzP61Gjf00)

## Python vs. Java: Syntax

* Python

In [4]:
print('Hello world!')

Hello world!


* Java

```java
public class HelloWorld
{
    public static void main (String[] args)
    {
        System.out.println("Hello world!");
    }
}
```

## Python vs. C, Matlab, R, and Julia: Speed

| Task     | Python |    C   | Matlab  |    R   | Julia   
| :------- |:------:|:------:| :------:|:------:|:--------:
| Loops    | 61.97  |   0.55 |    6.80 | 744.93 |  0.34
| Matrix multiplication | 0.95 |   -    |    0.90 | 11.46  |  1.09
| Open files and plot data  | 1399 |   - |    1678 |   2220 |  1317
| Metropolis-Hastings algorithm | 0.08 | 4.30 | 0.99 | 28.63 |  0.73


Source: https://modelingguru.nasa.gov/docs/DOC-2625


## A Brief History of Python

![Guido van Rossum](figs/van_rossum_2006.jpg "Guido van Rossum")

* Started in December 1989 by Guido van Rossum, BDFL (Benevolent Dictator for Life)
* Python 2.0 released in 2000
* Python 3.0, which is backward-incompatible, released in 2008
* End of Life date for Python 2.7 was January 1st, 2020

## From Last Week: Objects, Data Types, and Expressions

* Computer programs manipulate objects
* Objects have types
  * Scalar – indivisible (atomic; represents a single data value)
  * Non-scalar – with internal structure (collections of multiple data values)
* Expressions combine objects and operators

## Scalar Data Types

* Integer
* Float
* Boolean
* NoneType
* (String is non-scalar in Python)

In [4]:
# type() returns the type of the object passed to it
print(type(2))
print(type(1.125))
print(type(True))
print(type(None))
print(type('a'))



<class 'int'>
<class 'float'>
<class 'bool'>
<class 'NoneType'>
<class 'str'>


## Converting between Scalar Data Types

* Use the name of a type to convert values to that type

In [29]:
a = float(123)
b = int('32')  # '32' is in quotation marks because it is a str; int() converts it to an int
print(a, b)
c = int()  # with no argument, int() returns 0
d = float(234)
e = int(25)
print(d, e)
print(e)
print(type('32'))       # '32' is a str
print(type(int('32')))  # int('32') has been converted to an int


123.0 32
234.0 25
25
<class 'str'>
<class 'int'>
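
Conversion can change the value as well as the type. The sketch below shows a few cases worth knowing; the variable names are illustrative and not part of the lecture.

```python
# Conversions can change the value, not just the type.
truncated = int(3.9)      # int() drops the fractional part (no rounding)
negative = int(-3.9)      # truncation is toward zero
parsed = float('1.5')     # a str holding a number can be parsed
label = str(25) + '!'     # numbers must be converted before joining with a str
print(truncated, negative, parsed, label)

# bool() treats 0 as False and every other number as True.
print(bool(0), bool(7))

# A str that is not a whole number cannot be passed to int() directly.
try:
    int('3.9')
except ValueError:
    print("int('3.9') raises ValueError; int(float('3.9')) works instead")
```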


## Operators


* Arithmetic
* Boolean
* Comparison
* Assignment

## Arithmetic Operators

* `+` &nbsp;&nbsp; addition
* `-` &nbsp;&nbsp; subtraction
* `*` &nbsp;&nbsp; multiplication
* `/` &nbsp;&nbsp; division
* `%` &nbsp;&nbsp; modulus (remainder)
* `//` &nbsp;&nbsp; floor division (quotient)
* `**` &nbsp;&nbsp; exponent

In [31]:
# + and * have different meanings depending on the types of objects with which they are used
print(2 + 2)
print('a' + 'bc')
print(3 * 2)
print(3 * 'a' + 'h!')
print(2+2)
print('a'+'bc')
print(45 % 7)   # modulus: the remainder of 45 divided by 7
print(39 // 2)  # floor division: the quotient, rounded down

4
abc
6
aaah!
4
abc
3
19
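
Floor division and modulus are two halves of the same calculation, and `/` behaves differently from both. A small sketch with illustrative variable names:

```python
a, b = 45, 7
quotient = a // b       # how many whole times b fits into a
remainder = a % b       # what is left over
print(quotient, remainder)
print(quotient * b + remainder == a)  # the two results always recombine to a

print(7 / 2)    # true division
print(8 / 2)    # the result of / is a float even when it is a whole number
print(2 ** 10)  # exponent
print(-7 // 2)  # floor division rounds down, toward negative infinity
```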


## Boolean Operators

* `and`
* `or`
* `not`

In [38]:
print(True and False)
print(True and True)
print(False and True)
print(False and False)  # `and` is False if at least one operand is False
print(True or True)
print(True or False)
print(False or True)
print(False or False)  # `or` is True if at least one operand is True
print(not False)
print(not True)

False
True
False
False
True
True
True
False
True
False


## Comparison Operators

* `==` &nbsp;&nbsp; equals
* `!=` &nbsp;&nbsp; does not equal
* `>` &nbsp;&nbsp; is greater than
* `<=` &nbsp;&nbsp; is less than or equal to, etc. (the `<`, `>`, or `!` always comes before the `=`)

## Assignment Operators

* `=` &nbsp;&nbsp; assign right operand to left operand
* `+=` &nbsp;&nbsp; add right operand to left operand and assign to left operand
* `-=` &nbsp;&nbsp; subtract right operand from left operand and assign to left operand, etc.

In [39]:
a = 2  # This is assignment (do not confuse it with R's `<-`)
a += 3  # Equivalent to a = a + 3, i.e. add 3 to a
print(a)
b = 4
b -= 7
print(b)

5
-3
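
The same pattern works for the other arithmetic operators, and for `+` and `*` on strings. A brief sketch with illustrative names:

```python
n = 10
n *= 3    # n = n * 3   -> 30
n //= 4   # n = n // 4  -> 7
n **= 2   # n = n ** 2  -> 49
n %= 5    # n = n % 5   -> 4
print(n)

word = 'ab'
word += 'c'   # concatenation: 'abc'
word *= 2     # repetition: 'abcabc'
print(word)
```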


## Comparison vs. Assignment Operators


In [43]:
a = 2 # This is assignment
print(a == 1) # This is test for equality. It returns bool.
b = 5
print(b >= 7 or b <= 6)

False
True
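
A few more comparisons, as a hedged sketch rather than part of the original slides. It shows that comparisons can be chained, that strings compare alphabetically, and that floats should be compared with care.

```python
x = 5
print(x == 5, x != 5)      # equality and inequality both return bool
print(1 < x <= 10)         # chained comparison: 1 < x and x <= 10
print('apple' < 'banana')  # strings are compared character by character
print(1 == 1.0)            # an int and a float with the same value are equal

# Floats are stored approximately, so exact equality can surprise.
print(0.1 + 0.2 == 0.3)
print(abs((0.1 + 0.2) - 0.3) < 1e-9)  # compare with a small tolerance instead
```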


## Non-Scalar Data Types

* List – a mutable ordered sequence of values
* Tuple – an immutable ordered sequence of values
* String – an immutable ordered sequence of characters
* Set – a mutable unordered collection of unique values
* Dictionary – a set of key/value pairs

In [75]:
list_var = [1, 2, 2, 'a', 'a'] # list []
tuple_var = (1, 2, 'a', 'b') # tuple()
set_var = {1, 2, 2, 'a', 'b'} # set {}
dict_var = {1: 'a', 2: 'b', 3: 'c'} # dictionary{}
print(list_var, set_var, tuple_var)
list_var2 = [2, 1, 3, 'b','b']
tuple_var2 = (7,1,'b','z', 'a')
set_var2 = {3, 5, 'b', 4, 'b', 3, 'a', 4, 4, 4, 7}  # sets are unordered, so they do not preserve the order in which elements were written; duplicates are dropped
dict_var2 = {1: '3', 2: 'g', 3: '3', 4: 'b'}
print(type(dict_var2[1]))  # str: the value stored under key 1 is a string, not an int
print(dict_var2[4])  # indexing with a key returns the corresponding value
print(list_var2, tuple_var2, set_var2, dict_var2)

[1, 2, 2, 'a', 'a'] {1, 2, 'b', 'a'} (1, 2, 'a', 'b')
<class 'str'>
b
[2, 1, 3, 'b', 'b'] (7, 1, 'b', 'z', 'a') {3, 4, 5, 7, 'a', 'b'} {1: '3', 2: 'g', 3: '3', 4: 'b'}


## Converting between Non-Scalar Data Types

* Use the name of a type to convert values to that type

In [76]:
tup = tuple([1, 2, 3])  # tuple() converts the list [1, 2, 3] into a tuple
tup_2 = tuple([3, 7, 4])
print(type(tup))
dic = dict( [(1, 'd'), (2, 'b'), (3, 'c')] )  # dict() converts a list of (key, value) pairs into a dictionary
print(type(dic))
dic_2 = dict( [(1, 'v'), (2, 'g'), (3, 'k')] )

print(tup, dic)
print(tup_2, dic_2)

<class 'tuple'>
<class 'dict'>
(1, 2, 3) {1: 'd', 2: 'b', 3: 'c'}
(3, 7, 4) {1: 'v', 2: 'g', 3: 'k'}
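
The same idea applies to `list()` and `set()`. The following sketch (illustrative names only) shows what each conversion keeps and what it drops.

```python
letters = list('abc')              # a str becomes a list of characters
unique = set([1, 1, 2, 3, 3])      # converting to a set drops duplicates
keys = list({1: 'a', 2: 'b'})      # converting a dict keeps only its keys
pairs = list({1: 'a', 2: 'b'}.items())  # items() gives (key, value) tuples
deduped = sorted(set([3, 1, 3, 2]))     # sorted() returns a list

print(letters, unique, keys, pairs, deduped)
```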


## Membership Operator

* `in` &nbsp;&nbsp; left element is in right non-scalar
* `not in` &nbsp;&nbsp; left element is not in right non-scalar
  

In [77]:
print('x' not in 'abcdefg')
print('y' not in 'syyyy')

True
False
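
Membership works on every non-scalar type, with two details worth noting: on strings it tests for substrings, and on dictionaries it tests keys only. A minimal sketch with made-up data:

```python
vowels = {'a', 'e', 'i', 'o', 'u'}
occupations = {'Raj': 'astrophysicist'}

print('e' in vowels)        # set membership
print(3 in [1, 2, 3])       # list membership
print('bc' in 'abcd')       # for strings, `in` looks for a substring
print('Raj' in occupations)             # for dicts, `in` checks the keys
print('astrophysicist' in occupations)  # values are not searched
print('astrophysicist' in occupations.values())  # search the values explicitly
```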


## Length of Non-Scalar Objects

* The `len()` function returns the number of elements in a non-scalar object


In [78]:
print( len( [0, 1, 2] ) )  # list
print( len('ab') )  # str is non-scalar in Python
print( len( (1, 2, 3, 4, 'a') ) )  # tuple
print( len( {1: 'a', 2: 'b'} ) )  # dict
print( len( [6, 7, 2, 1, 31] ) )  # list with 5 elements
print( len( ('t', 6, 1, 13, 4) ) )  # tuple with 5 elements
print( len( {3, 7, '3', '7', '4', 4, 4} ) )  # set: duplicates count once
print( len( {3: '3', 2: '5', 4: '7', 1: '3'} ) )  # dict: number of key-value pairs
print( len('sdfdflkajf384y2432') )


3
2
5
2
5
5
6
4
18


## Non-Scalar Data Types: Exercise

In [96]:
# Use len() to count the number of unique letters in the string below

s = 'jackie will budget for the most expensive zoology equipment'



# set(s) builds a set from s; likewise list(s) builds a list and tuple(s) a tuple
# set(s) also contains the space character, so subtract 1
len(set(s)) - 1


26
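
Subtracting 1 relies on knowing that the string contains spaces. One alternative, sketched here as a suggestion rather than the official solution, removes the space explicitly before counting.

```python
s = 'jackie will budget for the most expensive zoology equipment'

# Option 1: drop the spaces from the string before building the set
letters = set(s.replace(' ', ''))
print(len(letters))

# Option 2: build the set first, then remove the space with set difference
letters_alt = set(s) - {' '}
print(len(letters_alt))
```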

## Strings

* You can write string literals in different ways
  * Single quotes: `'allows embedded "double" quotes'`
  * Double quotes: `"allows embedded 'single' quotes"` (use these when the text itself contains single quotes)
  * Triple quoted: `'''Three single quotes'''`, `"""Three double quotes"""`


In [103]:
'''Triple quoted strings may span multiple lines - 
all associated whitespace will be included 
in the string literal.'''


'Triple quoted strings may span multiple lines - \nall associated whitespace will be included \nin the string literal.'

* Strings implement all of the common sequence operations we will shortly discuss, along with some additional methods: http://docs.python.org/3/library/stdtypes.html#string-methods

## Objects Have Methods Associated with Them

### `object.method()`

Use the period `.` to link the method to the object.

In [108]:
string1 = 'Hello'

string1 + '!'   # This is an operator. Operators combine objects in expressions.
len(string1)   # This is a function. Functions take objects as arguments.
string1.upper()   # This is a method. Methods are attached to objects.


'HELLO'

## String Methods: Formatting

* `S.upper()` – change to upper case
* `S.lower()` – change to lower case
* `S.capitalize()` – capitalize the first character and change the rest to lower case
* `S.find(S1)` – return the index of the first occurrence of S1, or -1 if S1 is not found

In [114]:
print('Make me scream!'.upper())
x = 'make this into a proper sentence'  # assign the string to a variable, then call x.method()
print(x.capitalize() + '.')

print('Find the first "i" in this sentence.'.find('i'))  # 'i' is the second character, so the result is index 1
print('what the fuck'.upper())
x = "i don't like coding"
print(x.capitalize() + '!')
print('find the first z in this sentence.'.find('p'))  # find() returns -1 when the substring does not occur

MAKE ME SCREAM!
Make this into a proper sentence.
1
WHAT THE FUCK
I don't like coding!
-1


## String Methods: `strip` and `replace`

* `S.replace(S1, S2)` – find all instances of S1 and change to S2
* `S.strip()` – remove whitespace characters from the beginning and end of a string (useful when reading in from a file); `S.strip(S1)` removes the characters in S1 instead

In [141]:
x = ' This is a long sentence that we will use as an example.\n'  # '\n' is a newline character; repeat it ('\n\n\n') for several line breaks. strip() removes leading and trailing whitespace, including newlines

print(x.replace('s', 'S'))
print(x.strip())
print(x.replace(' ', ''))

y = ' This is a sentence created by mini.\n\n\n'
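print(y.replace('s', 'S'))  # replace() returns a new string, so the trailing '\n\n\n' is still there
print(y.strip())  # strip() removes whitespace at both ends, so the next output follows directly on the next line
print(y.replace(" ", ""))  # again a new string built from y, so the trailing '\n\n\n' is still there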
print('sdf')



 ThiS iS a long Sentence that we will uSe aS an example.

This is a long sentence that we will use as an example.
Thisisalongsentencethatwewilluseasanexample.

 ThiS iS a Sentence created by mini.



This is a sentence created by mini.
Thisisasentencecreatedbymini.



sdf
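
`strip` has one-sided variants and accepts an argument, and `replace` accepts an optional count. The sketch below uses a made-up string and `repr()` to make the whitespace visible.

```python
raw = '  --Hello, world!--\n'

cleaned = raw.strip()               # whitespace removed from both ends
no_dashes = raw.strip().strip('-')  # then remove leading and trailing '-' characters
left_only = raw.lstrip()            # only the beginning is stripped
right_only = raw.rstrip()           # only the end is stripped

print(repr(cleaned))
print(repr(no_dashes))
print(repr(left_only))
print(repr(right_only))

# replace() takes an optional third argument: the maximum number of replacements.
first_only = 'a-b-c'.replace('-', '+', 1)
print(first_only)

# Strings are immutable: none of these calls changed raw itself.
print(repr(raw))
```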


## String Methods: `split` and `join`

* `S.split(S1)` – split the string into a list of substrings, using S1 as the separator
* `S.join(L)` – combine the strings in the input sequence into a single string, using S as the separator

In [154]:
x = 'this is a collection of words i would like to break it into tokens'
y = x.split()    # the default is to split on whitespace; pass a string to split on that string instead
z = x.split('k')
print(y)
print(z)
print(x.split('o'))  # split on 'o'; the separator itself is removed from the result

x_new = '-'.join(y)
print(x_new)

['this', 'is', 'a', 'collection', 'of', 'words', 'i', 'would', 'like', 'to', 'break', 'it', 'into', 'tokens']
['this is a collection of words i would li', 'e to brea', ' it into to', 'ens']
['this is a c', 'llecti', 'n ', 'f w', 'rds i w', 'uld like t', ' break it int', ' t', 'kens']
this-is-a-collection-of-words-i-would-like-to-break-it-into-tokens
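
`split` and `join` are inverses when they use the same separator. The following sketch, with an invented comma-separated line, also shows how `split()` and `split(' ')` differ on repeated spaces.

```python
line = 'name,age,city'
fields = line.split(',')
print(fields)
print(' | '.join(fields))
print(','.join(fields) == line)  # joining with the same separator restores the string

# With no argument, runs of whitespace count as one separator.
print('a  b   c'.split())
# With an explicit ' ', every single space is a separator, so empty strings appear.
print('a  b'.split(' '))
```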


## String Methods: Exercise

In [186]:
# Use string methods to create a properly formatted sentence 
# from the words below:

ls = ['  ThiS', 'SenTence', 'NEEDS', 'bEttEr!', 'foRmatting'] 
ls_new = ' '.join(ls).lower().replace('!', '').strip().capitalize() + '.'
# Method calls can be chained from left to right; there is no need to nest parentheses

print(ls_new)



This sentence needs better formatting.


## Unordered Types vs. Sequences

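* Unordered types: `set`, `dict` (a `dict` preserves insertion order since Python 3.7, but it is indexed by key rather than by position)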
* Ordered (sequence) types: `str`, `list`, `tuple`
  

In [194]:
st = {1, 2, 2, 'a', 'b'} # sets are unordered
print(st)

{1, 2, 'b', 'a'}


## Set Methods

![Set operations](figs/sets.png "Set operations")

* `S1.union(S2)`, `S1 | S2` — elements in S1 or S2, or both
* `S1.intersection(S2)`, `S1 & S2` — elements in both S1 and S2
* `S1.difference(S2)`, `S1 - S2` — elements in S1 but not in S2
* `S1.symmetric_difference(S2)`, `S1 ^ S2` — elements in S1 or S2 but not both

In [196]:
st1 = set('homophily')

st2 = set('heterophily')
print(st1 ^ st2)

{'r', 't', 'm', 'e'}
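
All four operations can be tried on the same two sets. In this sketch `sorted()` is used only to make the printed order predictable, since sets themselves are unordered.

```python
st1 = set('homophily')
st2 = set('heterophily')

print(sorted(st1 | st2))  # union
print(sorted(st1 & st2))  # intersection
print(sorted(st1 - st2))  # in st1 but not in st2
print(sorted(st2 - st1))  # in st2 but not in st1

# The symmetric difference is the union minus the intersection.
print(st1 ^ st2 == (st1 | st2) - (st1 & st2))
```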


## Dictionary Operations: Indexing

* Dictionaries are indexed by keys

In [197]:
mydic = {'Howard': 'aerospace engineer', 'Leonard': 'physicist', 'Sheldon': 'physicist', 
         'Penny': 'waitress', 'Raj': 'astrophysicist'}
print(mydic['Raj'])
print(mydic['Leonard'])

astrophysicist
physicist
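
Indexing with a key that does not exist raises an error, so it helps to know the alternatives. A short sketch follows; the entries for `'Amy'` and the new value for `'Penny'` are made up for illustration.

```python
mydic = {'Howard': 'aerospace engineer', 'Penny': 'waitress'}

# Indexing with a missing key raises KeyError.
try:
    mydic['Amy']
except KeyError:
    print('Amy is not a key yet')

# get() returns None (or a chosen default) instead of raising.
missing = mydic.get('Amy')
fallback = mydic.get('Amy', 'unknown')
print(missing, fallback)

# Assigning to a new key adds a pair; assigning to an existing key overwrites its value.
mydic['Amy'] = 'neurobiologist'
mydic['Penny'] = 'sales representative'
print(len(mydic), list(mydic.keys()))
```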


## Sequence Operations: Indexing

* Lists, tuples, and strings are indexed by numbers

In [3]:
'ABCDEFG'[2]


'C'

## Indexing in Python starts from 0!

![Indexing sequences](figs/indexing.jpg "Indexing sequences")

Source: https://devrant.com/rants/1798534/array-start-from-zero-d

## Sequence Operations: Indexing

* Use `elem[index]` to extract individual sub-elements

In [198]:
print( 'abc'[0] ) 
print( ('a', 'b', 'c')[-1]) # use negative numbers to index from the end
print( ['a', 'b', 'c'][3]) #there's no element for [3]

a
c


IndexError: list index out of range

## Sequence Operations: Slicing

* Use `elem[start:end]` to get sub-sequence starting from index `start` and ending at index `end-1`

In [38]:
ls = [10, 20, 30, 40, 50]
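print( ls[1:5] )  # from index 1 up to, but not including, index 5
print( ls[:3] )   # a start of 0 can be omitted
print( ls[1:] )   # an omitted end means "to the end of the list"
print(len(ls))

ls[:] == ls[0:len(ls)]  # True, but not displayed because the result is not printed
print(ls[::-1])  # the general form is x[start:end:step]; a step of -1 reverses the list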

[20, 30, 40, 50]
[10, 20, 30]
[20, 30, 40, 50]
5
[50, 40, 30, 20, 10]
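
Slicing works the same way on strings and tuples. The sketch below uses an illustrative string and also shows that, unlike indexing, a slice that runs past the end does not raise an error.

```python
word = 'ABCDEFG'

print(word[2:5])     # indices 2, 3, 4
print(word[-3:])     # the last three characters
print(word[:100])    # an end beyond the length is simply cut off
print(word[::3])     # every third character, starting at index 0
print(word[5:2:-1])  # backwards from index 5 down to, but not including, index 2
```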


## Sequence Operations: Extended Slices

* Use `elem[start:end:step]` to get sub-sequence starting from index `start`, in steps of `step`, ending at index `end-1`

In [23]:
ls = [10, 20, 30, 40, 50]
print( ls[::2] ) # get elements with even indices
print( ls[::-1] ) # get elements in reverse order


[10, 30, 50]
[50, 40, 30, 20, 10]


## Indexing and Slicing: Exercise

In [55]:
# Use indexing and slicing to create a new string that contains 
# the 2nd, 4th, 5th, 6th, and last characters from the string below
s = 'abcdefghijklmnopqrstuvwxyz'
new_s = s[1] + s[3:6] + s[-1]  # 2nd character, 4th to 6th characters, last character
print(new_s)


bdefz


## More Sequence Operations

In [24]:
tup1 = 3 * (1,) # Notice that tuple of length 1 needs comma!
tup2 = tup1 + (2, 2) # Concatenate the two tuples
print(tup1, tup2)

print( max(tup2) ) # or min()
print( sum(tup2) )
print( tup2.count(1) )
print( tup2.index(2) )

(1, 1, 1) (1, 1, 1, 2, 2)
2
7
3
3


* Why use tuples? 
    * ⚡️ They use less memory than lists
    * They can be used as dictionary keys; lists can't
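
The dictionary-key point can be checked directly. In this sketch the `grid` dictionary and its contents are hypothetical.

```python
# A tuple can be a dictionary key because it is immutable.
grid = {(0, 0): 'start', (2, 3): 'goal'}
print(grid[(2, 3)])

# A list cannot be a key: Python raises TypeError.
try:
    grid[[1, 1]] = 'wall'
    list_key_allowed = True
except TypeError:
    list_key_allowed = False
print(list_key_allowed)

# Tuples also cannot be changed in place.
point = (2, 3)
try:
    point[0] = 5
    tuple_changed = True
except TypeError:
    tuple_changed = False
print(tuple_changed, point)
```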

## Mutability

* Immutable types: `str`, `tuple`, and all scalars
* Mutable types: `list`, `set`, `dict`

**Objects of mutable types can be modified once they are created.**

In [25]:
dic = {1:'a', 2:'b'}
dic[3] = 'c'
print(dic)

ls = [5, 4, 1, 3, 2]
ls.sort()
print(ls)

{1: 'a', 2: 'b', 3: 'c'}
[1, 2, 3, 4, 5]


## Mutability Can Be Quite Convenient

There are several useful list methods, see http://docs.python.org/3/library/stdtypes.html#mutable-sequence-types:

* `L.append(e)`
* `L.insert(i, e)`
* `L.remove(e)`
* `L.extend(L1)`
* `L.pop(i)`
* `L.sort()`
* `L.reverse()`

In [13]:
ls1 = [1, 2, 3]
ls1.append(4)
print(ls1)
ls1.extend([5, 6, 7, 8, 9, 10])
print(ls1)

[1, 2, 3, 4]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]


## Mutability Can Also Be Dangerous

In [14]:
ls1 = [1, 2, 3]
ls2 = [4, 5, 6, 7]

ls1.append(ls2)
print(ls1)

ls2.extend([8, 9, 10])
print(ls1)

[1, 2, 3, [4, 5, 6, 7]]
[1, 2, 3, [4, 5, 6, 7, 8, 9, 10]]


## Aliasing vs. Cloning

![Aliasing](figs/aliasing.png "Aliasing")

In [16]:
ls1 = [1, 2, 3]
ls2 = ls1[:]  # Using [:] is one way to clone

ls1.reverse()
print(ls2)

[1, 2, 3]
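
The `is` operator reports whether two names refer to the same object, which makes the difference between an alias and a clone visible. This is a sketch with illustrative names; the last part shows that `[:]` copies only the outer list.

```python
original = [1, 2, 3]
alias = original      # a second name for the same list
clone = original[:]   # a new list with the same elements

original.append(4)
print(alias)   # the alias sees the change
print(clone)   # the clone does not
print(alias is original, clone is original)

# Cloning with [:] is shallow: inner lists are still shared.
nested = [[1, 2], [3]]
shallow = nested[:]
nested[0].append(99)
print(shallow)
```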


## List Methods:  `append` vs. `extend`

In [29]:
mylist = [1, 2, 3, 4]
mylist.append(5)
print(mylist)

mylist.extend([8, 7, 6])
print(mylist)

[1, 2, 3, 4, 5]
[1, 2, 3, 4, 5, 8, 7, 6]


## List Methods: `remove` vs. `pop`

In [13]:
mylist = [1, 2, 3, 4]

mylist.remove(1)
print(mylist)

popped = mylist.pop(1)
print(popped, mylist)

[2, 3, 4]
3 [2, 4]
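
In short, `remove` takes a value and `pop` takes an index. A further sketch with a made-up list shows two details: `remove` deletes only the first match, and `pop()` without an index takes the last element.

```python
mylist = ['a', 'b', 'c', 'b']

mylist.remove('b')   # removes the first 'b' only
print(mylist)

last = mylist.pop()  # with no index, pop() removes and returns the last element
print(last, mylist)

# remove() raises ValueError when the value is not in the list.
try:
    mylist.remove('z')
except ValueError:
    print("'z' is not in the list")
```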


## List Methods: `L.sort()` vs. `sorted(L)`

In [31]:
mylist = [4, 5, 2, 1, 3]
mylist.sort()  # Sorts in-place. It is more efficient but overwrites the input.
print(mylist)

mylist = [10, 9, 6, 8, 7]
sorted(mylist)  # Returns a new sorted list, which is discarded here; mylist is unchanged
print(mylist)

newlist = sorted(mylist)  # Creates a new list that is sorted, not changing the original.
print(mylist, newlist)

[1, 2, 3, 4, 5]
[10, 9, 6, 8, 7]
[10, 9, 6, 8, 7] [6, 7, 8, 9, 10]
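
Both `sorted()` and `L.sort()` accept the optional `key` and `reverse` arguments. The sketch below uses an invented word list and ends with a common mistake: assigning the result of `L.sort()`.

```python
words = ['pear', 'apple', 'Fig']

by_default = sorted(words)                    # upper-case letters sort before lower-case
ignoring_case = sorted(words, key=str.lower)  # compare the lower-cased versions
by_length = sorted(words, key=len)            # shortest first
descending = sorted([4, 5, 2, 1, 3], reverse=True)
print(by_default, ignoring_case, by_length, descending)

# L.sort() changes the list in place and returns None.
result = words.sort()
print(result, words)
```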


## Aliasing and Cloning: Exercise

What will the following program print?

```
ls1 = [11, 3, 8, 6, 6, 1]
ls2 = ls1
ls2[2] = 0
print(ls1)
```

* (A) [11, 3, 8, 6, 6, 1]
* (B) [11, 0, 8, 6, 6, 1]
* (C) [11, 3, 0, 6, 6, 1]
* (D) 0

## Data Types in Python


| Type     | Scalar     | Mutability | Order   
| :------: |:----------:|:----------:| :---------:
| `int`    | scalar     | immutable  |             
| `float`  | scalar     | immutable  |  
| `bool`   | scalar     | immutable  | 
| `None`   | scalar     | immutable  | 
| `str`    | non-scalar | immutable  | ordered
| `tuple`  | non-scalar | immutable  | ordered
| `list`   | non-scalar | mutable    | ordered
| `set`    | non-scalar | mutable    | unordered
| `dict`   | non-scalar | mutable    | unordered

* Objects have types
* Objects have methods

-------

* **Lab**: Lists, lists, lists (and some strings)
* **Next week**: Control flow in Python