# Basic concepts and data types

### 1. Variable Names

Apart from Python's reserved keywords, a variable name can be almost any combination of letters, digits, and underscores, as long as it does not start with a digit. Underscores are commonly used to separate words for better readability. Names are case sensitive in Python, as in most programming languages.

A name cannot contain any blank spaces, which is why underscores usually stand in for them.

You must always assign a value to a variable before you use it. In other words, always do "x = " something before you later use 'x'. If you don't have a meaningful value yet, you can initialize it to 0 or None.

In [30]:
num_of_cat = 0
num_of_dog = None

num_of_cat = 2
num_of_dog = 2 + 1

# You can give a variable value using other variables as well, you don't have to use explicit numbers
num_of_animal = num_of_cat + num_of_dog

print ("number of animals:",num_of_animal)

number of animals: 5
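
As a small illustration of the naming rules above, here is a sketch with made-up variable names (they are not part of the original notebook). It shows that names differing only in case are separate variables, and that using a name before assigning it raises a NameError. The try/except is only there so the error is reported without stopping the cell.

```python
# Names are case sensitive: these are two different variables.
cat_count = 2
Cat_Count = 10
print(cat_count, Cat_Count)

# Digits are allowed, as long as the name does not start with one.
cat2 = "Tom"
print(cat2)

# Using a name that was never assigned raises a NameError.
name_error_raised = False
try:
    print(dog_count)
except NameError as error:
    name_error_raised = True
    print("NameError:", error)
```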


Your code editor will often highlight built-in names such as 'sum', 'min', 'print', and 'list'. Strictly speaking these are not reserved: Python lets you reassign them, unlike true keywords such as 'for' or 'if', which can never be used as variable names. They are the names of built-in functions and types, so using them for your own variables is bad coding practice unless you deliberately want to replace the built-in.

Notice that they are colored differently from self-defined variables.

In [49]:
x = 2
print ("type of x:", type(x), ", x =",x)

# Bad practice here:
max = 3
print (max)

# If you now call max() to find a maximum, it will fail, because the name max now refers to the integer 3.
# For the rest of this code (or notebook session), max is an integer and has lost its original functionality.

type of x: <class 'int'> , x = 2
3
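
To tell true keywords apart from built-in names, one option is to ask Python itself. The sketch below uses the standard-library modules keyword and builtins; the list of names being checked is just an example chosen for this illustration.

```python
import builtins
import keyword

# Keywords can never be variable names; built-ins can be reassigned (but shouldn't be).
for name in ["for", "if", "max", "list", "num_of_cat"]:
    is_keyword = keyword.iskeyword(name)
    is_builtin = hasattr(builtins, name)
    print(name, "| keyword:", is_keyword, "| built-in:", is_builtin)
```

A name that is neither a keyword nor a built-in, like num_of_cat, is safe to use for your own variable.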


### 2. Basic data types, conversion and operations between different data types

The data types you'll use most commonly are int (integer), float (decimal number), and string. You can often convert one into another. Another data type you might use, often implicitly, is boolean, which is either True or False.

In [107]:
x = 1
print ("type of x:", type(x), ", x =",x)

y = str(x)
print ("type of y:", type(y), ", y =",y)

z = float(x)
print ("type of z:", type(z), ", z =",z)

type of x: <class 'int'> , x = 1
type of y: <class 'str'> , y = 1
type of z: <class 'float'> , z = 1.0
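
A few more conversions, as a minimal sketch with example values chosen for illustration. It also shows where a boolean typically comes from: a comparison.

```python
whole = int("42")             # string -> int
decimal = float("3.5")        # string -> float
truncated = int(3.9)          # float -> int drops the fractional part (no rounding)
is_bigger = whole > decimal   # a comparison produces a boolean

print(whole, decimal, truncated, is_bigger)
print(type(is_bigger))
```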


You can also create strings using '' or "", which are equivalent. Initializing an empty string is useful when you want to build one up from scratch, much like initializing a counter to 0 before summing, say, 1 to 10.
Note that adding strings concatenates them, i.e. links them into a longer string.

In [111]:
x = ""
y = ''
print (x == y)

word_list = ['cat','is','better','than','dog']

for word in word_list:
    x += word
    x += " "   # add a space between words

print (x)


True
cat is better than dog 
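
The loop above leaves a trailing space at the end of the string. As an alternative sketch (not the approach used in this notebook), the built-in string method join() puts the separator only between the words:

```python
word_list = ['cat', 'is', 'better', 'than', 'dog']

# " ".join(...) glues the words together with a single space between them.
sentence = " ".join(word_list)

print(sentence)
print(len(sentence))
```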


Note that a string containing a number prints exactly like the number itself, but the two are of different types as far as the computer is concerned. You cannot do '1'+1 and expect to get 2; see the error below:

In [53]:
x = 1
y = '1'
z = x+y

TypeError: unsupported operand type(s) for +: 'int' and 'str'

You can see the TypeError above: addition between 'int' and 'str' is not allowed.

If you want to 'add' them, you need to convert them into data types that can operate together. You can either:
1. convert int to string, then add
2. convert string to number, then add

Notice the difference in the code output below.
1. Converting the string into a number and adding is addition in the mathematical sense
2. Converting the number into a string and adding is string concatenation

Note that you can only convert purely numerical strings into numbers. You cannot convert 'icecream' directly into a number; there are ways to map text to numbers, but they are beyond the scope of what you'll be working with.
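
To see what happens with a non-numeric string, here is a minimal sketch using example inputs. The failed conversion raises a ValueError, which is caught here so the loop can continue; storing None for the failed case is a choice made for this illustration only.

```python
converted = []
for text in ["12", "3.5", "icecream"]:
    try:
        converted.append(float(text))
    except ValueError:
        # float() could not interpret the text as a number
        print(text, "cannot be converted to a number")
        converted.append(None)

print(converted)
```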

In [70]:
x = 1     # x is an int
y = '1'   # y is a string

A = x + int(y)
print ('A = ', A)
print ('type of A:', type(A))

# Note that the result of int + float automatically becomes a float,
# you can tell from the output having a decimal point.

A = x + float(y)
print ('A = ', A)
print ('type of A:', type(A))

# Also note that Python executes your code from the first line to the last. Even though the variable 'A'
# was defined above as the number 2, you can define it again as a string without affecting the
# earlier print results: when those prints ran, A was still 2, and Python did not yet know
# about anything on the lines below.

A = str(x) + y
print ('A = ', A)
print ('type of A:', type(A))

# When you assign a value to a variable using the equal sign "=", Python (like most other languages) first
# evaluates the right hand side completely, and only then stores the result under the name on the left hand side.
# The next lines illustrate this: the old value of A is used to build the new value of A.
A = "I have " + A + " cats."
print ('A = ', A)

A = str(x) + y # Reset A back to "11", because after the assignment above, A is "I have 11 cats."
buffer = "I have " + A + " cats."
A = buffer

# The above buffering mechanism is the reason why you can do things like:
sums = 0  # note that I named my variable sums instead of sum, to avoid using the reserved name sum()
for i in range(5):
    sums = sums + i   # Or equivalently: sums += i
print (sums)

# At every new loop iteration, the "sums" on the right hand side holds the result of the
# previous iteration, so each pass simply adds i to the variable "sums".
# Note that in coding, counting usually starts from 0: range(5) gives you 0,1,2,3,4 and does not include
# the number 5. This is easy to forget when you're not used to it.


A =  2
type of A: <class 'int'>
A =  2.0
type of A: <class 'float'>
A =  11
type of A: <class 'str'>
A =  I have 11 cats.
10
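
Since the end point of range() is easy to get wrong, here is a short sketch of the two common forms. It also sums 1 to 10 with the same accumulator pattern; the variable names are chosen for this example.

```python
print(list(range(5)))       # starts at 0, stops before 5
print(list(range(1, 11)))   # starts at 1, stops before 11

total = 0
for number in range(1, 11):
    total += number          # same as: total = total + number
print(total)
```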


There's an exception: you can multiply a string by an int (but not by a float), which is shorthand for repeating the string that many times.

In [162]:
z = 'Cat'
print (z*5)

CatCatCatCatCat
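
The sketch below, with example values, shows one common use of string repetition and confirms that a float multiplier is rejected with a TypeError:

```python
separator = "-" * 20          # a quick way to draw a line in printed output
print(separator)

float_repeat_failed = False
try:
    "Cat" * 2.0               # a float is not allowed, even if it is a whole number
except TypeError as error:
    float_repeat_failed = True
    print("TypeError:", error)

print("Cat" * int(2.0))       # converting the float to an int first works
```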


There's also a tuple data type. It's like a list, but it's immutable, meaning you can't change the element at a given index once the tuple is created. You probably won't be using tuples much, but they are useful when you want a hashable 'list'. More about hashing later.

In [171]:
num = [1,2,3]
tup1 = tuple(num)
tup2 = (5,6,7)

print (tup1)
print (tup2)

num[1] = 20 

print (num)

# You cannot do tup1[1] = 20 because a tuple is immutable; try it and you'll get a TypeError.
# If you need different contents, re-define tup1 instead.

# You can access tuple elements by index, the same way you would with a list.

# You also cannot use pop() on a tuple. A tuple is meant to be stable.
# Note that a string is immutable in the same sense as a tuple.

tup1 = tuple(num)

print (tup1)

print (tup1[1])



(1, 2, 3)
(5, 6, 7)
[1, 20, 3]
(1, 20, 3)
20
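
Here is a minimal sketch of what the comments above describe, using a made-up tuple named point. It shows the error from item assignment, how to "modify" a tuple by re-defining it, and a tuple used as a dictionary key.

```python
point = (3, 4)

assignment_failed = False
try:
    point[0] = 10              # not allowed: tuples are immutable
except TypeError as error:
    assignment_failed = True
    print("TypeError:", error)

# "Changing" a tuple means building a new one and re-using the name.
point = (10, point[1])
print(point)

# Because tuples are hashable, they can serve as dictionary keys.
labels = {(0, 0): "origin", point: "moved point"}
print(labels[(10, 4)])
```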


### ? Exercise:
Using a single for loop and the 2 lists given, create the three strings below and print them out. Don't forget the spaces.
1. "3 cow 2 sheep 6 duck "

2. "9 cow 4 sheep 36 duck "

3. "3333 cow 2222 sheep 6666 duck ", could you utilize string and integer multiplication in this case?

Access list elements via index, e.g. x[0] is 3 and y[1] is 'sheep'

In [None]:
x = [3,2,6]
y = ['cow','sheep','duck']

# Your code here:


### 3. Basic data structures

List (Python's term for array)

Set (Python's term for hash set)

Dictionary (Python's term for hash table)

You can convert between set and list, just like you can convert an int into a float.

#### List is the most fundamental data structure (storage) you'll use:

In [161]:
# Initialization, these two methods are equivalent
x = list()
x = []

# Method append(), meaning add values to the tail:
# To use methods, use a 'dot' after the variable name, and add the method's name (with the parenthesis)
# This way of calling methods is similar to when you were using numpy.arange()
x.append(1)
x.append(2)
x.append(3)

print ("1. ",x)

for i in range(4,11):
    x.append(i)

print ("2. ",x)

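# Method extend() is similar to append(), but it takes an iterable as its argument and adds each of its elements.
# A list is used here so that the order of the added elements is guaranteed.
x.extend([11,12,13])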

print ("3. ",x)

# Method pop() removes and returns the last element of the list
print ("4. ",x.pop())

# x.pop() not only returned the last element, which WAS 13, it also removed that element from x.
print ("5. ",x)

# You can access elements by using its index, note that Python index counts from 0.
# So x[0] is the first element of list x, x[5] is the 6th element.
print ("6. ",x[0], x[5])

# On a side note, len() can be used to check the length (size) of a collection
# such as a list, string, set, dictionary, or tuple.
# Note that len() is a built-in function; it's not a method exclusive to lists.
print ("7. ", len(x))
print ("8. ", len("Cats"))

1.  [1, 2, 3]
2.  [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
3.  [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
4.  13
5.  [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
6.  1 6
7.  12
8.  4
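
The difference between append() and extend() is easiest to see side by side. This is a small sketch with example lists named a and b:

```python
a = [1, 2, 3]
a.append([4, 5])   # adds the whole list as ONE new element

b = [1, 2, 3]
b.extend([4, 5])   # adds each element of the iterable separately

print(a, len(a))
print(b, len(b))
```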


There are many other neat tricks for lists that you probably won't need here:
https://www.techbeamers.com/python-list/

If you're interested you can read this in the future, but for the moment don't confuse yourself with extra information.

#### Hash
In Python, only hashable objects can be used as elements of a set or as keys of a dictionary. You can look up what a hash function and hashing are if you're interested. Hashing is also a building block of cryptocurrencies, whose proof-of-work schemes rely on deliberately costly hashing for their security.

As far as you're concerned right now, the common hashable data types / structures in Python are: string, int, float, and tuple (provided everything inside the tuple is hashable too). These are the ones you'll typically use as dictionary keys. You cannot use a list, set, or dictionary as a dictionary key.
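
A quick way to find out whether something is hashable is to try it as a dictionary key. The sketch below uses a made-up dictionary named prices: a tuple key works, while a list key raises a TypeError.

```python
prices = {}
prices[("apple", "red")] = 1.5        # a tuple of strings is hashable

list_key_failed = False
try:
    prices[["apple", "green"]] = 1.2  # a list is not hashable
except TypeError as error:
    list_key_failed = True
    print("TypeError:", error)

print(prices)
```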

#### Set
There are two main differences between list and set:

1. A set doesn't support indexing: with x = set(), you cannot access x[0]. Elements in a set are unordered. You can still use the pop() method to remove and return an element, but which element you get is arbitrary.

2. A set doesn't have repeated elements. If x = {1,2,3}, adding another 1 to x won't change it at all.

Although the lack of ordering can seem inconvenient, the advantage of a set is that checking whether an element exists in it (a lookup) is very fast, so a set is useful when your code performs a lot of lookups. Checking membership in a list of n elements takes time roughly proportional to n in the worst case, while for a set the time is, on average, independent of n. For large n, the set can therefore be on the order of n times faster, and in real-life computation n can be huge. This property is achieved by hashing.
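
The two differences can be checked directly. This sketch uses a made-up list of colors; the try/except only reports the indexing error without stopping the cell.

```python
colors = ["red", "blue", "red", "green", "blue"]
unique_colors = set(colors)            # duplicates are dropped

print(len(colors), len(unique_colors))

unique_colors.add("red")               # already present, so nothing changes
print(len(unique_colors))

print("green" in unique_colors)        # fast membership check

indexing_failed = False
try:
    unique_colors[0]                   # sets do not support indexing
except TypeError as error:
    indexing_failed = True
    print("TypeError:", error)
```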

### ? Exercise:
1. Create a list A that contains all the positive even numbers <= 50.
2. Create another list B that contains all the positive multiples of 7 <= 50.
3. How many elements are in each list? Can you find out without using len()?
4. Create a new list C that combines all the elements of A and B. How many elements are there in list C?
5. Turn list C into a set using the built-in set() function; you can name the new set D. How many elements are there in set D?
6. Print out list C and set D. What has changed?

Write your code in the cell below to help you answer these questions.

In [None]:
# Your code here:




#### Dictionary

In [141]:
x = {}   # Initialize empty dictionary
x['cat'] = 1
x[1] = 2.2
x['1'] = 'cat'
x[(1,2)] = 'sheep'
x[2.1234] = [1,2,3]

Notice that you can use string, int, float, and tuple as keys. For the values, you can put almost anything you want: numbers, strings, even lists, dictionaries, and more complex things. You can even create a multi-layer dictionary:

In [146]:
x = {}
x[1] = {}
x[1][2] = {}
x[1][2][3] = 'iron man'
print (x)

{1: {2: {3: 'iron man'}}}
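
Reading values back out works with the same square-bracket syntax. The sketch below uses made-up dictionaries (inventory and nested) to show a lookup, a membership check on the keys, the get() method with a fallback value, and access into a multi-layer dictionary.

```python
inventory = {'apple': 3, 'pear': 0}

print(inventory['apple'])            # look up a value by its key
print('kiwi' in inventory)           # 'in' checks the keys
print(inventory.get('kiwi', 0))      # get() returns the fallback instead of raising KeyError

inventory['apple'] += 2              # update an existing entry
print(inventory)

nested = {'fruit': {'apple': {'color': 'red'}}}
print(nested['fruit']['apple']['color'])   # one pair of brackets per layer
```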


### ? Exercise:
1. What are the keys and corresponding values of the above dictionary? You can play with the code to answer this or write your own.
2. Create a dictionary named 'letter_count'. For every word in the list 'words_list', create an entry with the word itself as the key and its letter count as the corresponding value, e.g. you should have an entry like 'proton':6
3. Create another dictionary named 'inverse_letter_count', this time use the letter count as the key, and the word itself as the value.
4. Print these two dictionaries you just created.

In [172]:
words_list = ['proton','neutron','dripline','Schrodinger']

# Your code here:


### 4. Common Operators

You can read all about Python operators here: https://www.w3schools.com/python/python_operators.asp

#### The common arithmetic ones are 
+, - , *, /, %, //

A double slash means integer (floor) division, e.g. 5//2 = 2: it rounds the result down to the nearest integer.

x%y is x mod(ulus) y: the remainder left after subtracting from x the largest multiple of y that does not exceed x. For example, 9%5 = 4 because 9 = 5*1 + 4, and 13%5 = 3 because 13 = 5*2 + 3.

Note that x = (x//y)*y + x%y

In [89]:
x = 20
y = 7

print ("normal division:",x/y)
print ("integer division:",x//y)
print ("modulus:",x%y)
print ( (x//y)*y + x%y )


normal division: 2.857142857142857
integer division: 2
modulus: 6
20
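
"Rounds down" matters for negative numbers, where // does not simply cut off the decimals. A short sketch with example values, including the built-in divmod(), which returns the quotient and remainder together:

```python
print(7 // 2, 7 % 2)
print(-7 // 2, -7 % 2)       # -3.5 is rounded DOWN to -4, so the remainder is 1

quotient, remainder = divmod(20, 7)
print(quotient, remainder)
print(quotient * 7 + remainder)   # reconstructs the original number
```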


#### The common logical ones are
not, and, or, in

The first three are used with booleans:

In [93]:
x = True
y = False
print (not x)

print (x or y)

print (x and y)

print (x and not y)

False
True
False
True
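
In practice the booleans usually come from comparisons rather than being typed in directly. A minimal sketch with made-up variables:

```python
temperature = 22
is_raining = False

is_comfortable = temperature > 18 and temperature < 28
go_outside = is_comfortable and not is_raining
stay_inside = is_raining or temperature < 0

print(is_comfortable, go_outside, stay_inside)
```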


'in' is commonly used either to check membership (it returns a boolean) or, together with a for loop, to access the items of an iterable data structure one by one. Note that a string is also an iterable, in the sense that you can think of it as a list of characters.

In [156]:
# Check if something is 'in' an iterable
sentence = "A cat is not a dog. "
print ("cat is in the sentence:","cat" in sentence)   # returns True because the string "cat" is in sentence
print ("mice is in the sentence:","mice" in sentence)  # returns False

nums = [1,2,3,4,5]
print ("1 is in the list nums:",1 in nums)   
print ("15 is in the list nums:",15 in nums)

cat is in the sentence: True
mice is in the sentence: False
1 is in the list nums: True
15 is in the list nums: False


In [178]:
# Accessing items in iterables
for i in sentence:
    print (i)

for i in nums:
    print (i)

A
 
c
a
t
 
i
s
 
n
o
t
 
a
 
d
o
g
.
 
1
2
3
4
5


In [152]:
animals = {'mice': 2, 'chicken':4, 'human': 5, 'cat':1}
# When you use 'in' on a dictionary, you're actually working with the dictionary's keys.
# The keys are 'mice', 'chicken', 'human', 'cat'.
# The values are 2, 4, 5, 1.

for i in animals:
    print (i)

# The above is equivalent to:
for i in animals.keys():
    print (i)
    
for i in animals.values():
    print (i)

mice
chicken
human
cat
mice
chicken
human
cat
2
4
5
1
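
When you need both the key and the value in each iteration, the dictionary method items() gives you the pairs directly. A sketch using the same animals dictionary as above (the variable names name, count, and total are chosen for this example):

```python
animals = {'mice': 2, 'chicken': 4, 'human': 5, 'cat': 1}

# items() yields (key, value) pairs, unpacked here into two loop variables.
for name, count in animals.items():
    print(name, "->", count)

# Summing the values with the accumulator pattern from earlier.
total = 0
for count in animals.values():
    total += count
print(total)
```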


### ? Exercise:
Z is a list of proton numbers; N and R are lists of the corresponding neutron numbers and radii in units of fm.

By corresponding I mean that the nucleus with proton number Z[0] and neutron number N[0] has radius R[0].

Create a dictionary any way you want, as long as it achieves the following goals:
1. Given the information of proton number Z and neutron number N, use the dictionary to output the corresponding radius.
2. Print out all key: value pairs of your dictionary using for loops.

Recall which data types can be used as the keys here?

In [None]:
Z = [20,23,28,30]
N = [30,32,35,45]
R = [4.37, 4.5, 4.71, 4.99]

radius_dict = {}
# Your code here:


# Goal: print out the radius for Z = 23, N = 32
print (radius_dict[] )    #<-- you need to fill in the key depending on how you defined it


# For loops
### 1. Basics

The basic structure of a for loop statement is:

for  #variable#   in   #iterable#:

    block of code

The iterable can be:
1. A string, which is treated like a list of characters: every iteration accesses one character of the string.
2. A list, tuple, or set. You should be familiar with lists at this point; sets and tuples behave almost identically here.
3. A dictionary. Here you're actually using dictionary.keys() as the iterable: when a dictionary appears where an iterable is expected, Python iterates through its keys. Since Python 3.7 the keys come out in the order they were inserted; in older versions the order was arbitrary, so don't rely on it if your code may run there.
4. Specialized iterables such as the output of range() or numpy.arange(). They are list-like objects but not exactly lists; you can convert them into a list by applying the list() function to them. When used as the iterable of a for loop, you can think of them as lists.

You can name the variable anything you want. You don't have to use i; it's just a dummy index people are used to.

Within a single iteration the variable keeps its value throughout the block of code (unless you explicitly change it inside the indented block, which is considered bad practice). The block can contain anything, including its own for loops, which can in turn contain further for loops. These are called nested for loops.

Python recognizes the block by its indentation: consecutive lines with the same indentation under the for statement belong to the block.

Don't forget the colon at the end of the for statement.



In [210]:
# 1. String as iterable

sent = "It's raining."
# Usually you can define a string using either a pair of "", or ''. Only in the case where you need to actually 
# have a ' in your string, should you use "" for the definition.

rev = ''
for char in sent:
    if char == " ": 
        rev += ' not'
    rev += char
    
print ("1. ",rev)

# 2. List as iterable
nums = list(range(5))

print ("2. ",nums)

# You can even use a single underscore as the variable name (by convention, though, _ usually marks
# a value you don't intend to use)
sums = 0
for _ in nums:
    sums += _
    
print ("3. ",sums)

sums = 0
# You can change the loop variable to another value inside the for loop, but it's not recommended:
# in a big block of code you could easily forget that you changed it at the beginning.
for i in nums:
    i = 0
    sums += i

print ("4. ",sums)

print ("\n5.")
# 3. Set as iterable
nums_set = set(nums)

for dogs in nums_set:
    print (dogs)

print ("\n6.")
# 4. Dictionary (by default its keys()) as iterable:
d = {1:'car', 'Audi': 'Germany', 'cat': 2.5, 132.123: 'random number', 'list': [1,3,5,7] }

for k in d:
    print ("variable =",k, "is being used as key, corresponding value:", d[k])

1.  It's not raining.
2.  [0, 1, 2, 3, 4]
3.  10
4.  0

5.
0
1
2
3
4

6.
variable = 1 is being used as key, corresponding value: car
variable = Audi is being used as key, corresponding value: Germany
variable = cat is being used as key, corresponding value: 2.5
variable = 132.123 is being used as key, corresponding value: random number
variable = list is being used as key, corresponding value: [1, 3, 5, 7]
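
A common situation is having two lists whose elements correspond by position. One way to walk through them together, sketched here with made-up lists fruits and prices, is to loop over the index with range(len(...)):

```python
fruits = ['apple', 'pear', 'kiwi']
prices = [1.5, 2.0, 0.8]

price_of = {}
for i in range(len(fruits)):
    # fruits[i] and prices[i] belong together because they share the index i
    print(i, fruits[i], prices[i])
    price_of[fruits[i]] = prices[i]

print(price_of)
```

This assumes both lists have the same length; if prices were shorter, prices[i] would raise an IndexError.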


### 2. Nested for loops
Best illustrated with a few examples:

In [228]:
for i in range(3):
    print ("Starting the outer loop of i =",i,'\n')
    for j in ['cats','dogs']:
        print ("\tStarting the inner loop of j =",j)
        print ("\t\tDoing something useful here: ",i,j)
        print ("\tEnding the inner loop of j =",j,'\n')
    print ("Ending the outer loop of i =",i,'\n****\n')

Starting the outer loop of i = 0 

	Starting the inner loop of j = cats
		Doing something useful here:  0 cats
	Ending the inner loop of j = cats 

	Starting the inner loop of j = dogs
		Doing something useful here:  0 dogs
	Ending the inner loop of j = dogs 

Ending the outer loop of i = 0 
****

Starting the outer loop of i = 1 

	Starting the inner loop of j = cats
		Doing something useful here:  1 cats
	Ending the inner loop of j = cats 

	Starting the inner loop of j = dogs
		Doing something useful here:  1 dogs
	Ending the inner loop of j = dogs 

Ending the outer loop of i = 1 
****

Starting the outer loop of i = 2 

	Starting the inner loop of j = cats
		Doing something useful here:  2 cats
	Ending the inner loop of j = cats 

	Starting the inner loop of j = dogs
		Doing something useful here:  2 dogs
	Ending the inner loop of j = dogs 

Ending the outer loop of i = 2 
****



In [229]:
for i in range(1,3):
    for j in range(1,3):
        for k in range(1,3):
            print (i,j,k)

1 1 1
1 1 2
1 2 1
1 2 2
2 1 1
2 1 2
2 2 1
2 2 2
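
Nested loops are also a natural way to collect every combination into a list instead of printing it. A small sketch that builds the (row, column) positions of a 2 x 3 grid; the names grid, row, and col are chosen for this example:

```python
grid = []
for row in range(2):
    for col in range(3):
        grid.append((row, col))   # one tuple per combination of row and col

print(grid)
print(len(grid))                  # 2 rows * 3 columns
```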


### ? Exercise
1. For all positive multiples of 3 smaller than 50, and all positive multiples of 7 smaller than 50, create a list of tuples of all possible pairs. E.g., your list should contain (3,7), (3,14), (3,21) ... (6,7), (6,14) ...

2. Do this again, now for all positive multiples of 2 smaller than 50 and all positive multiples of 7 smaller than 50. E.g., your list should contain (2,7), (2,14), (2,21) ... (4,7), (4,14) ... (6,7), (6,14) ...

3. Create a new list that combines the above 2 lists, but make sure there are no duplicates. E.g., (6,7) appears in both list 1 and list 2; make sure this pair appears in the final list only once.


In [None]:
# Your code here:



# Functions
### Definition

def  function_name(#input variables#):
    
    block of code
    
    return stuffs (or don't return anything)


Things to know:
1. Make sure your function name doesn't conflict with your variable names.
2. You can have as many different inputs as you want.
3. You can input almost any data type / structure you want: string, int, float, list, dictionary, etc.
4. As a good habit, make every variable that the function uses but doesn't define itself an input. I'll illustrate this later.
5. You don't have to return anything; it's optional. For example, you can choose to modify the input and return nothing.
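
To make the last point concrete, here is a minimal sketch with two made-up functions. One returns its result; the other only prints it. A function without a return statement still hands something back to the caller: the special value None.

```python
def rectangle_area(width, height):
    return width * height            # the caller receives this value

def print_area(width, height):
    print("area:", width * height)   # prints, but returns nothing

area = rectangle_area(3, 4)
nothing = print_area(3, 4)

print(area, nothing)
```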

In [256]:
# Basic type with return

def sum_with_weights(inp1,inp2,inp3):
    return inp1 + 2* inp2 + 3* inp3
    
# x = 1 + 2* 2 + 3* 3 = 14
x = sum_with_weights(1,2,3)

print ("1. ",x)

# Without return: this works only for mutable data types such as lists, to perform what's called an
# in-place modification. You probably won't be using this, but it's useful to know.

def array_addition(a,b):
    x = a   # x is another name for the same list, not a copy
    for i in range(len(x)):
        x[i] += b
    # no return needed
    
a = [2,3]
b = 20
print ("2. a=",a,"\tb=",b)
array_addition(a,b)   # modifies the list a in place
print ("3. a=",a,"\tb=",b)


# Here there's a variable i which you didn't define inside the function, and you also didn't make it an input
# It might still work sometimes, but this is bad practice.
def square(x):
    return i* x**2

print ("\n4.")
for i in range(2):
    print (square(2))
    
# Notice that it still prints 0* 2**2 = 0 and then 1* 2**2 = 4. This works only because the for loop
# defines a variable named i outside the function, which square() looks up each time it is called.

print ("\n5.")
# However, if you change the loop variable from i to j, you get the wrong result: i keeps the last value
# it was given in the previous for loop, i = 1, and stays 1 until you redefine it.
# So instead of printing 0 and 4, this loop prints 4, 4.
for j in range(2):
    print (square(2))

# To avoid this, simply make i an input in the function definition, and provide that input every time
# you use the function
def square2(i,x):
    return i* x**2

# You can see i is still stuck at i = 1 from the first loop, since we didn't give it a new value
print ("6. ",i)

# But when you repeat the loop with the new function square2(), the first argument passed in becomes
# square2()'s own i. So when I call square2(monkey,2), as far as square2() is concerned, i = monkey, and
# i's value changes with monkey: 0, then 1, as the for loop runs.
print ("\n7. ")
for monkey in range(2):
    print (square2(monkey,2))
    
# Now we arrive at the same result as the first loop, where we only got lucky because the loop
# variable happened to be named i, the same name that square() relies on.


1.  14
2. a= [2, 3] 	b= 20
3. a= [22, 23] 	b= 20

4.
0
4

5.
4
4
6.  1

7. 
0
4
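
If you do not want a function to change the caller's list, one option is to work on a copy and return it. The sketch below contrasts the two styles; the function names add_in_place and add_to_copy are made up for this illustration.

```python
def add_in_place(values, amount):
    # Modifies the caller's list directly; nothing is returned.
    for i in range(len(values)):
        values[i] += amount

def add_to_copy(values, amount):
    # list(values) creates a new list, so the caller's list stays untouched.
    result = list(values)
    for i in range(len(result)):
        result[i] += amount
    return result

original = [2, 3]
shifted = add_to_copy(original, 20)
print(original, shifted)

add_in_place(original, 20)
print(original)
```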


### ? Exercise
1. Create a function to calculate the distance between two points with coordinates of the form (x,y). Print the distance.

2. Create a function that finds the minimum of a list. Do not use the built-in min() function. Think about how you would scan through a long list in real life (not with code) to find the minimum. What would you keep track of?

3. Given a list 'points', create a function to find the point that's closest to the point at (13,23). Utilize the functions you just created above. This new function should take the list 'points' as input.

4. Create a function to find the average distance of the points in the list 'points' to the origin (0,0).

In the above tasks, you'll probably be using functions inside of functions, just like nested for loops.

In [257]:
# Task 1
p1 = (-23,39.6)
p2 = (31.7,58.2)
# Your code here:

# Task 2: create your own min() function, but don't name it min(), because that name belongs to the built-in function
# Your code here:

# Task 3: finding min distance in a list
points = [(1,2),(2,33),(-14,5),(22,-3),(1.2,56),(10.2,14.5),(-13,81),(50,55),(23,11),(0,0)]
target = (13,23)
# Your code here:

# Task 4: finding average
# Your code here:
