# Python Basics
By Shuhei Kitamura

### Outline<a id='top'></a>
1. [Hello World](#sec1)
2. [Arithmetic Operation](#sec2)
3. [Variables and Objects](#sec3)
4. [Types](#sec4)
    1. [Basics](#sec4_1)
    2. [Lists](#sec4_2)
        1. [Making a list](#sec4_2_1)
        2. [Accessing items](#sec4_2_2)
        3. [Checking items](#sec4_2_3)
        4. [Changing items](#sec4_2_4)
        5. [Adding items](#sec4_2_5)
        6. [Deleting items](#sec4_2_6)
    3. [Dictionaries](#sec4_3)
        1. [Making a dictionary](#sec4_3_1)
        2. [Accessing items](#sec4_3_2)
        3. [Checking keys](#sec4_3_3)
        4. [Checking values](#sec4_3_4)
        5. [Changing and adding items](#sec4_3_5)
        6. [Changing keys](#sec4_3_6)
        7. [Deleting items](#sec4_3_7)
    4. [Tuples](#sec4_4)
        1. [Making a tuple](#sec4_4_1)
        2. [Accessing and checking items](#sec4_4_2)
        3. [Changing, adding, and deleting items](#sec4_4_3)
5. [Loops and If Statements](#sec5)
6. [Functions](#sec6)
7. [Methods](#sec7)
8. [Modules and Packages](#sec8)
9. [NumPy](#sec9)
    1. [Vectorized computation and ufuncs](#sec9_1)
    2. [Making an array](#sec9_2)
    3. [Type of items in an array](#sec9_3)
    4. [Accessing items](#sec9_4)
        1. [Masking](#sec9_4_1)
        2. [Fancy indexing](#sec9_4_2)
    5. [Checking dimension, shape, size, and length of an array](#sec9_5)    
    6. [Changing arrays](#sec9_6)
        1. [Changing items](#sec9_6_1)
        2. [Adding items](#sec9_6_2)
        3. [Concatenating arrays](#sec9_6_3)
        4. [Deleting items](#sec9_6_4)
        5. [Reshaping an array](#sec9_6_5)
    7. [Summarizing items](#sec9_7)
10. [Pandas](#sec10)
    1. [Series](#sec10_1)
        1. [Making a series](#sec10_1_1)
        2. [Accessing items](#sec10_1_2)    
    2. [DataFrame](#sec10_2)
        1. [Making a dataframe](#sec10_2_1)
        2. [Accessing items](#sec10_2_2)
        3. [Checking dimension, shape, size, and length of a dataframe](#sec10_2_3)
        4. [Changing and adding items](#sec10_2_4)
        5. [Concatenating dataframes](#sec10_2_5)
        6. [Deleting items](#sec10_2_6)
        7. [Changing column names/indices](#sec10_2_7)
        8. [Sorting items](#sec10_2_8)
        9. [Transposing a dataframe](#sec10_2_9)
        10. [Summarizing items](#sec10_2_10)

## 1. Hello World<a id='sec1'></a>
- Programming is fun. We start the tutorial with a famous example of `"Hello World"`.
- Type `"Hello World"` and execute it (press Shift + Enter or click the "Run Cells" button above).
- Next, do the same for `print("Hello World")`. This tells Python to print `"Hello World"`.
- Does `Hello World` (without quotation marks) work?

[back to top](#top)

- If you want to write a comment rather than code to execute, put a `#` mark in front of the text.
- Try `print(1 + 2)` with and without `#` mark. Did you get the same result?

## 2. Arithmetic Operation<a id='sec2'></a>
- Arithmetic operations are possible in Python.
- Main operators are `+`, `-`, `*`, and `/`.
- Write `1 + 2` and execute. 
- Next, do the same for `print(1 + 2)`. Any difference?

[back to top](#top)

- `**` is the (mathematical) power operator. It is right associative and binds more tightly than a leading minus sign.
- Compare `-2 ** 4` and `(-2) ** 4`.
- Compare `3 ** 3 ** 3` and `(3 ** 3) ** 3`.
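
A minimal sketch of how precedence and right associativity play out, using different numbers from the ones above (the variable names are illustrative only):

```python
# ** binds more tightly than a leading minus, so -3 ** 2 means -(3 ** 2)
neg_first = -3 ** 2         # -9
paren_first = (-3) ** 2     # 9

# ** is right associative: 2 ** 3 ** 2 is read as 2 ** (3 ** 2)
right_assoc = 2 ** 3 ** 2       # 2 ** 9 = 512
left_grouped = (2 ** 3) ** 2    # 8 ** 2 = 64

print(neg_first, paren_first, right_assoc, left_grouped)
```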

- `%` for modulus (the remainder from the division) and `//` for floor division (integer division).
- Calculate `7 % 2`. 
- Calculate `7 // 2`. Compare it with `7 / 2`.
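
The two operators fit together, which a small sketch with other numbers can show (all names here are illustrative):

```python
quotient = 17 // 5     # floor division: 3
remainder = 17 % 5     # remainder: 2
true_div = 17 / 5      # ordinary division always gives a float: 3.4

# The results are linked: (a // b) * b + (a % b) gives back a
rebuilt = quotient * 5 + remainder

# Floor division rounds down (toward negative infinity), not toward zero
neg_quotient = -17 // 5    # -4
neg_remainder = -17 % 5    # 3

print(quotient, remainder, true_div, rebuilt, neg_quotient, neg_remainder)
```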

Exercise 1: Compute the following:
$$
5 \times \left( \frac{-3}{2} \right)^{2} 
$$

## 3. Variables and Objects<a id='sec3'></a>
- You make a variable and assign data to it. The data are called **objects** in Python.
    - A (built-in) object has an identity, a type, a value, and methods.
- For example, `x = 1` creates a variable with a name `x`, and assigns object `1` to it.
- Type `y`. You should get an error. Why?

[back to top](#top)

- Once you define a variable, you can use it later in your code. 
- Write:
```python
y = 1 + 2
```
- Then add `3` to `y` and print it.

- Tip: You can make more than one variable at once.

In [None]:
a, b = 1, 2
c = d = 1

- Tip: Similarly, you can print several values at once.
    - This is one of the nice features of Python!

In [None]:
x, y = 1, 2
print(x, y)
print(x); print(y)

Exercise 2: Make variables named `x` and `y`, and assign `"Hello"` and `"World"` to them, respectively. Print them.

## 4. Types<a id='sec4'></a>
- An object has a type. Major types are: **float**, **integer**, **string**, **boolean**, **list**, **tuple**, and **dictionary**.
- We will go through them one by one.

[back to top](#top)

### A. Basics<a id='sec4_1'></a>
- Let's start with four basic types: float (a floating-point number), integer, string, and boolean.
    - Numeric types are float, integer, and complex. We will not talk about complex in this class.
    - There used to be another type, long, in Python 2.x, which no longer exists in Python 3.x. 
- You can use `type()` to check the type of an object.
- Run the following code. What did you get?

[back to top](#top)

In [None]:
print(type(1.0))
print(type(1)) 
print(type("True"))
print(type(True))

- A string is a sequence of characters like `"Hello"`.
- You define a string using quotes `'...'` or `"..."`.
    - "In Python, single-quoted strings and double-quoted strings are the same. This PEP does not make a recommendation for this. Pick a rule and stick to it." (See [PEP](https://www.python.org/dev/peps/pep-0008/#string-quotes).)
- Tip: If a string includes a double quote, as in `"This is a "quote""`, it raises an error. In such a case, write `'This is a "quote"'` instead. That is, start and end the string with single quotes. Try both.

- What is a boolean? Check the outcomes of `1 + 2` and `1 + 2 == 3`. Also check their types.
    - `==` is a **relational operator** (or a comparison operator), which means "equal to" and returns `True` or `False`, i.e., a boolean. (You will learn about relational operators later.)

- There are relationships between booleans and numbers (e.g., floats, integers).
- `False` (boolean) is `0` (integer) or `0.0` (float).
    - `True` is `1` or `1.0`. 
- Both `0` (integer) and `0.0` (float) are `False` (boolean).
    - Both `1` and `1.0` are `True`. 
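- Any other nonzero number is also `True`. Most other objects (e.g., non-empty strings) are `True`, while empty objects such as `""` and `[]` are `False`.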
- What is the boolean of `0`? What about `"Hello"`?
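
As a rough illustration of these rules, `bool()` can be applied to a few sample values (the variable names are only for this sketch):

```python
zero_int, zero_float = bool(0), bool(0.0)        # both False
nonzero_int, nonzero_float = bool(1), bool(-2.5) # both True: any nonzero number
empty_string = bool("")                          # False: an empty string
zero_string = bool("0")                          # True: a non-empty string, even "0"
empty_list, one_item_list = bool([]), bool([0])  # False, then True

print(zero_int, zero_float, nonzero_int, nonzero_float)
print(empty_string, zero_string, empty_list, one_item_list)
```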

- Tip: Booleans can be used for arithmetic computation. You can also use `+` and `*` for strings.
- Try `True + False` and `"Ha" * 3`.

- ***Important***: However, you cannot combine numbers and strings with a `+` operator.
- Try:
```python
print("I have " + 10 + " bucks.")
```
Next, try:
```python
print("I have " + str(10) + " bucks.")  
```
- Here, `str(10)` converts the number `10` to the string `"10"`.
- Or use the following code if you like:
```python
print("I have", 10, "bucks.")  # print() already inserts a space between its arguments
```
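
Another common option, not covered above, is a formatted string literal (an "f-string"). This is a minimal sketch that assumes Python 3.6 or later; `amount` and `message` are illustrative names:

```python
amount = 10                           # illustrative value
message = f"I have {amount} bucks."   # {amount} is replaced by str(amount)
print(message)
```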

- In many cases, it is possible to change the type of an object (e.g., `str(10)`).
- To change the type of an object, use, e.g., `float()`, `int()`, `str()`, and `bool()`.
- Convert `1` to float. Also check the type.
- Next, convert `"abc"` to float. Did you get an error?

- There is an object called `None`.
    - `None` means non-existence (null).
    - The type of `None` is `NoneType`. `None` is `False` (boolean).
- There is a similar object `NaN` (`np.nan`) in the NumPy package.
    - Missing values in data are often denoted as `NaN`.
    - The type of `NaN`  is float. `NaN` is `True` (boolean).
- Run the following code.

In [None]:
print(None)
print(type(None))
print(bool(None))
import numpy as np # import NumPy. you will learn how to import packages/modules later.
print(np.nan)
print(type(np.nan))
print(bool(np.nan))

Exercise 3: Check the type of `1 + True`. Is the type different from that of `1.0 + True`?

### B. Lists<a id='sec4_2'></a>
- Let's move on to lists, which are a very important type.
- A list is written with square brackets `[]`.
- A list can contain objects of any type, including other lists.

[back to top](#top)

**a. Making a list**<a id='sec4_2_1'></a>
- You can make a list in the following way.
- Print the following lists. Then, check their types. Any difference?

[back to top](#top)

In [None]:
list1 = ["tom", 1.75, "jerry", 1.82] 
list2 = [["tom", 1.75], ["jerry", 1.82]]

- An empty list can be made in the following way.
- Print `list1`. Then, check the type.

In [None]:
list1 = []

- Tip: You can make more than one list at once.
- Run the following code.

In [None]:
list1, list2 = [1, 2], [3, 4]

- Tip: `*` operator can be used for making a list.
- What is the outcome of `["Ha"] * 4`? What about `["Ha" * 4]`?

- Tip: There are some clever ways to make a list.
- Print `list1`. Then, check the type.

In [None]:
list1 = list(range(10)) # range() is a function that produces integers. you will learn more about functions later.

**b. Accessing items**<a id='sec4_2_2'></a>
- Next, we want to access the first item of `list1` (i.e., `"tom"`). How?
```python
list1 = ["tom", 1.75, "jerry", 1.82] 
```

[back to top](#top)

- To access the first item of `list1`, type `list1[0]`, i.e., an object name + `[0]`.
    - Shouldn't it be `[1]`?
    - ***Important***: Python uses "zero-based indexing." Indices start from zero, not one!
- Access the second item of `list1`.

In [None]:
list1 = ["tom", 1.75, "jerry", 1.82]

- Next, we want to access `"tom"` in `list2`. How?
```python
list2 = [["tom", 1.75], ["jerry", 1.82]]
```
- Try `list2[0]`. What did you get? Next, try `list2[0][0]`.

In [None]:
list2 = [["tom", 1.75], ["jerry", 1.82]]

- You can also use **slicing** to access a subset of a list.
- Write `mylist[start:end]`.
    - The rule is `mylist[inclusive:exclusive]`. That is, the end index is not included.
    - You do not always need to write `start` or `end`. You can also write, e.g., `mylist[:end]` or `mylist[start:]`. 
- Access the first three items in `list1` using slicing.

In [None]:
list1 = ["tom", 1.75, "jerry", 1.82]

- You can also use a negative index, which means the index from the end of a list.
- Access `1.75` and `"jerry"` in `list1` using a negative index and slicing.

In [None]:
list1 = ["tom", 1.75, "jerry", 1.82]

- Tip: There is a funky way of accessing items in a list. Use `mylist[start_index::step]`.
    - This `step` is the distance between the selected indices (e.g., `2` means that every second item is taken, so 1 item is skipped each time).
- Try `list1[0::1]`. Then, try `list1[0::2]` and `list1[1::2]`.
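
The slicing forms introduced so far can be sketched side by side on a separate list (`letters` is made up for this illustration):

```python
letters = ["a", "b", "c", "d", "e", "f"]  # illustrative list

first_two = letters[0:2]      # ['a', 'b']: index 2 is excluded
from_third = letters[2:]      # ['c', 'd', 'e', 'f']
last_item = letters[-1]       # 'f': -1 is the last item
last_two = letters[-2:]       # ['e', 'f']
middle = letters[1:-1]        # ['b', 'c', 'd', 'e']
every_other = letters[::2]    # ['a', 'c', 'e']: step of 2
odd_positions = letters[1::2] # ['b', 'd', 'f']

print(first_two, from_third, last_item, last_two, middle, every_other, odd_positions)
```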

In [None]:
list1 = ["tom", 1.75, "jerry", 1.82]

- You can also access a subset of a string using the same methods.
- Access `"jer"` in `"jerry"`.

**c. Checking items**<a id='sec4_2_3'></a>
- You can check if a list contains a specific item using `in` like `"x" in mylist`.
    - This returns `True` or `False`.
    - The `in` and `not in` are also called **membership operators**. 
- Check if `list1` contains `"jerry"`.

[back to top](#top)

In [None]:
list1 = ["tom", 1.75, "jerry", 1.82]

- Similarly, you can check if two items are the same or not by using relational operators (e.g., `==`, `!=`) or **identity operators** (`is`, `is not`).
    - You will learn more about such operators later.
- For example, type `mylist[0] == mylist[1]`.
- Check if the first and third items are the same in `list1` using a relational operator.

In [None]:
list1 = ["tom", 1.75, "jerry", 1.82]

**d. Changing items**<a id='sec4_2_4'></a>
- Once you access an item, you can change it.
- Before doing so, it is worthwhile to mention the difference between mutable and immutable objects.
    - **Immutable objects**: You cannot change an object itself (**float**, **int**, **str**, **bool**, and **tuple**).
    - **Mutable objects**: You can change an object itself (**list** and **dict**).
    
[back to top](#top)

- A list is a mutable object. To change an item in a list, type `mylist[index] = new_item`.
- Change `"tom"` to `"spike"` in `list1`. Then, print it.

In [None]:
list1 = ["tom", 1.75, "jerry", 1.82]

- To see the difference between mutable and immutable objects, try the following:
    - Recall: Strings are immutable objects.

In [None]:
"tom"[2] = "l"

- But recall that you can do the following change even for immutable objects.
    - Recall: Booleans are immutable objects.
- In this case, a new object is assigned, rather than changing the object itself.

In [None]:
print(float(True))
print(id(float(True)), id(True)) # check if float(True) and True have the same identity

- ***Important***: For mutable objects, assigning one variable to another (e.g., `list2 = list1`) makes both names refer to the same object, so changing one list automatically changes the other.
- Print `list1`. What did you find?

In [None]:
list1 = ["a", "b", "c"]
list2 = list1 # list1 and list2 now refer to the same list object
print(id(list1), id(list2)) # show that two objects have the same identity
list2[1] = "d" # change only an item in list2

- If you want to avoid such an automatic replacement, type instead `mylist2 = mylist1[:]`.
- Try the same example, but using this alternative method.
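
A small sketch of the difference between a second name and a copy, with made-up variable names:

```python
original = [1, 2, 3]
alias = original         # a second name for the same list object
shallow = original[:]    # a new list containing the same items

alias[0] = 99            # visible through both `alias` and `original`
shallow[1] = -1          # affects only the copy

print(original)                                # [99, 2, 3]
print(shallow)                                 # [1, -1, 3]
print(alias is original, shallow is original)  # True False
```

Note that `[:]` makes a shallow copy: if the list contains other lists, those inner lists are still shared between the original and the copy.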

In [None]:
list1 = ["a", "b", "c"]

**e. Adding items**<a id='sec4_2_5'></a>
- To add an item to a list, use `mylist.append(item)`.
    - This `.append()` is a **method**. You will learn about methods later.
- Add `"spike"` to `list1`.

[back to top](#top)

In [None]:
list1 = ["tom", 1.75, "jerry", 1.82]

- Alternatively, you can do the same thing using a `+` operator.
- Add `"spike"` to `list1` using a `+` operator.

In [None]:
list1 = ["tom", 1.75, "jerry", 1.82]

**f. Deleting items**<a id='sec4_2_6'></a>
- Deleting items from a list is like taking a subset of the list.
- Remove `"tom"` from `list1` by taking a subset.
- Alternatively, you can use `mylist.remove(item)` or `del mylist[index]`. Repeat the same thing using each of these alternatives.

[back to top](#top)

In [None]:
list1 = ["tom", 1.75, "jerry", 1.82]

Exercise 4:
1. Make a list of integers from 1 to 20 and name it `list1`. Print it.
2. Check the type of the fifth item of `list1`.
3. Convert `10` to `"10"` and print `list1`.

### C. Dictionaries<a id='sec4_3'></a>
- The remaining types are dictionaries and tuples.
- A dictionary is written with braces `{}`.
- A dictionary contains a set of key-value pairs. You can access a value using the associated key.
- A dictionary can contain objects of any type, including other dictionaries.
- Keys have to be immutable (more precisely, hashable) objects, e.g., floats, integers, strings, and booleans.
    - For example, a list cannot be a key.
- ***Important***: Dictionaries have no positional index!
    - Recall: You are able to access items in a list using indices. You cannot do the same thing for dictionaries; you access items by key. (Since Python 3.7, dictionaries do remember the order in which keys were inserted, but you still cannot write `mydict[0]` to get the "first" item unless `0` is a key.)
    
[back to top](#top)

**a. Making a dictionary**<a id='sec4_3_1'></a>
- You can make a dictionary in the following way.
- Check the types of these dictionaries. Any difference?

[back to top](#top)

In [None]:
dict1 = {'tom':1.75, 'jerry':1.82}
dict2 = {'tom':{'height':1.75, 'weight':80.0}, 'jerry':{'height':1.82, 'weight':85.0}}

**b. Accessing items**<a id='sec4_3_2'></a>
- To access an item in a dictionary, use `mydict[key]`.
- Access the value of `'tom'` (i.e., `1.75`) in `dict1`.
- Access the weight of `'jerry'` in `dict2`. Hint: Use `mydict[key1][key2]`.

[back to top](#top)

In [None]:
dict1 = {'tom':1.75, 'jerry':1.82}
dict2 = {'tom':{'height':1.75, 'weight':80.0}, 'jerry':{'height':1.82, 'weight':85.0}}

**c. Checking keys**<a id='sec4_3_3'></a>
- You can access all the keys of a dictionary using `mydict.keys()`.
- Use `key1 in mydict` to check if `mydict` contains `key1`.
- Check all the keys of `dict2`.
- Check if `'height'` is included as a key in `dict2` and `dict2['tom']`, respectively.

[back to top](#top)

In [None]:
dict2 = {'tom':{'height':1.75, 'weight':80.0}, 'jerry':{'height':1.82, 'weight':85.0}}

- ***Important***: If a key appears more than once, only the rightmost value for that key is kept.
    - Of course, this should not happen in your data! The key should be unique as always.
- Run the following code.

In [None]:
dict1 = {'tom':1.75, 'jerry':1.82, 'tom':1.95}
dict1['tom']

**d. Checking values**<a id='sec4_3_4'></a>
- To check all the values of a dictionary, use `mydict.values()`.
- Check if `dict1` contains `1.75`. What about it in `dict2`?

[back to top](#top)

In [None]:
dict1 = {'tom':1.75, 'jerry':1.82}
dict2 = {'tom':{'height':1.75, 'weight':80.0}, 'jerry':{'height':1.82, 'weight':85.0}}

**e. Changing and adding items**<a id='sec4_3_5'></a>
- Like lists, dictionaries are mutable objects.
- You can add a key-value pair to a dictionary using `mydict[key]=value` or `mydict.update({key1:value1, key2:value2, ...})`.
- You can also change values in the same way. 
- Add a `'spike':1.65` pair to `dict1`.
- Change `'spike'`'s height to `1.58`.
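
Both ways of adding and changing pairs can be sketched on a separate dictionary (`heights` and its values are made up for this illustration):

```python
heights = {"tom": 1.75}                        # illustrative data
heights["jerry"] = 1.82                        # add a new pair
heights["tom"] = 1.76                          # the same syntax changes an existing value
heights.update({"butch": 1.60, "tom": 1.77})   # add and change several pairs at once

print(heights)
```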

[back to top](#top)

In [None]:
dict1 = {'tom':1.75, 'jerry':1.82}

**f. Changing keys**<a id='sec4_3_6'></a>
- To change a key to another, you can write `mydict[new_key] = mydict.pop(old_key)`.
- Change `'tom'` in `dict1` to `'spike'`. Print it.
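
To see why this one-liner works, here is the same idea split into two steps on a made-up dictionary (`ages` is illustrative):

```python
ages = {"tom": 5, "jerry": 3}      # illustrative data

removed_value = ages.pop("tom")    # removes the key and returns its value
ages["thomas"] = removed_value     # store the value under the new key

print(removed_value)   # 5
print(ages)            # {'jerry': 3, 'thomas': 5}
```

The renamed key ends up as the most recently inserted item, since it is added as a new pair.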

[back to top](#top)

In [None]:
dict1 = {'tom':1.75, 'jerry':1.82}

**g. Deleting items**<a id='sec4_3_7'></a>
- To delete a key-value pair, you can use `del mydict[key]`.
- Delete `'tom'` and its associated value from `dict1`.

[back to top](#top)

In [None]:
dict1 = {'tom':1.75, 'jerry':1.82}

Exercise 5:
1. Make a dictionary of `[165, 58]` and `180` whose keys are `'tom'` and `'jerry'`, respectively. Name it `dict1`.
2. Replace `180` with `[180, 60]`. Print it.

### D. Tuples<a id='sec4_4'></a>
- A tuple is written with parentheses `()`.
- Like lists and dictionaries, tuples can contain other tuples.

[back to top](#top)

**a. Making a tuple**<a id='sec4_4_1'></a>
- You can make a tuple in the following way.
- Check the types of these tuples. Any difference?

[back to top](#top)

In [None]:
tuple1 = (1.0, 2, ["tom", "jerry"])
tuple2 = ((1.0, 2), (["tom", "jerry"], True))

**b. Accessing and checking items**<a id='sec4_4_2'></a>
- Similar to the list:
    - To access an item in a tuple, use `mytuple[index]`.
    - You can use relational operators and membership operators to check items.
- Check if the first and the third items in `tuple1` are the same using a relational operator.

[back to top](#top)

In [None]:
tuple1 = (1.0, 2, ["tom", "jerry"])

**c. Changing, adding, and deleting items**<a id='sec4_4_3'></a>
- A tuple is similar to a list except that ***you cannot change, add, or delete the items***.
    - Recall: Tuples are immutable objects, while lists are mutable objects.
    - An apparent exception: You can join two tuples using `+`, but this creates a new tuple rather than changing the original one.
- Try changing `2` to `4` in `tuple1`.
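
A brief sketch of what `+` does with tuples, using made-up names:

```python
point = (1, 2)
extended = point + (3,)     # (3,) is a one-item tuple; + builds a new tuple

print(point)                # (1, 2): the original is unchanged
print(extended)             # (1, 2, 3)
print(point is extended)    # False: two different objects
```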

[back to top](#top)

In [None]:
tuple1 = (1.0, 2, ["tom", "jerry"])

Exercise 6: Change `2` to `4` in `tuple1`. (Hint: Convert the type before changing them.)

In [None]:
tuple1 = (1.0, 2, ["tom", "jerry"])

## 5. Loops and If Statements<a id='sec5'></a>
- Loops allow you to automate some of your processes.
- There are `for` loops and `while` loops.
    - You often use `range()` in `for` loops.
    
[back to top](#top)

- Use colons and indents to make a loop.
    - To insert an indent, press the TAB key.
- You can make a loop in the following way:
```python
for x in range(1,6,2): # use a colon
    print("Current number is ", x) # insert an indent
```
- Make a loop that prints integers from 0 to 10.

- You can also make a loop over a string, a list, a dictionary, and a tuple.
- Run the following code.

In [None]:
print("--- loop over a string ---")
str1 = "abcde"
for x in str1:
    print(x)

print("--- loop over a list ---")
list1 = ["tom", 1.75, "jerry", 1.82]
for x in list1:
    print(x)

print("--- loop over a list of lists (a nested loop) ---")
list_of_lists1 = [["tom", 1.75], ["jerry", 1.82]]
for list1 in list_of_lists1:
    for x in list1:
        print(x) 

print("--- loop over a dictionary ---")        
dict1= {'tom':1.75, 'jerry':1.82}
for x in dict1:
    print(x)
    
print("--- loop over a tuple ---")        
tuple1 = (1, 2, 3)
for x in tuple1:
    print(x)

- You may often append items to a list using a loop.

In [None]:
list1 = [] # make an empty list
for i in range(0, 10):
    j = 1 + i
    list1.append(j)
print(list1)

- You can do a specific operation for a subset of the entire loop by using if statements.
- To make if statement clauses, you can use `if`, `elif` (means "else if"), and `else`.
    - You can use `if` alone, without `elif` or `else`.
    - Similarly, you can use `if` and `else` without `elif`, or `if` and `elif` without `else`.
- Python checks the clauses from top to bottom and runs only the first one whose condition is `True`.    
- Use colons and indents for if statement clauses.

In [None]:
list1 = [-1, 1, 3, 10]
for x in list1: # loop over list1
    if x < 0:
        print("if")
    elif x >= 0 and x < 5:
        print("elif")        
    else:
        print("else")

- To make if conditions, you can use relational operators (`==`, `!=`, `>`, `<`, `>=`, `<=`), **logical operators** (`and`, `or`, `not`), identity operators (`is`, `is not`), and membership operators (`in`, `not in`).

In [None]:
list1 = ["tom", 1.75, 3, None]
for x in list1:
    if x is None:
        print("None!")
    elif type(x) == float or type(x) == int:
        if x >= 0 and x < 2:
            print("0 <= x < 2")
        else:
            print("x < 0 or x >= 2") 
    else:
        print("neither None, nor float, nor int")

- ***Important***: The difference between `==` and `is` is that `==` tests for logical equality, while `is` tests for object identity.
    - For `None`, you should use `is`. That's the rule.
    <!-- see, https://www.python.org/dev/peps/pep-0008/#programming-recommendations -->
- Run the following code. Can you see the difference?

In [None]:
print(id(True), id(1))
print(True is 1, True == 1)

x, y = 1, 1 # immutable
print(id(x), id(y))
print(x is y, x == y)

x, y = [1,2], [1,2] # mutable
print(id(x), id(y))
print(x is y, x == y)

x = [1,2]
y = x
print(id(x), id(y))
print(x is y, x == y)

- Similarly, `&` and `|` perform bitwise operations, while `and` and `or` perform logical operations.
    - See e.g. [this page](https://en.wikipedia.org/wiki/Bitwise_operation).
- Run the following code. Can you see the difference?

In [None]:
print(2 > 1 and 4 > 3)
print(3 & 5) # decimal 3 is 0011 while decimal 5 is 0101 in 4-bit. intersection is 0001, which is decimal 1

- A loop can be used inside a list (to make a list). This method is called a **list comprehension**.

In [None]:
list1 = [item for item in range(10)]
print(list1)
list2 = [x**2 for x in range(10) if x < 5]
print(list2)

- A `while` loop repeats the process as long as its condition is `True`, i.e., until the condition becomes `False`.
    - For infinite loops, use `while True`.
- You may often use counters and **assignment operators** (e.g., `+=`, `-=`) in a `while` loop.
- You can make a while loop in the following way.

In [None]:
i = 0 # set the initial number
while i < 10:
    i += 1 # add 1 to i. same as i = i + 1
    print(i)

- You may often use `break` and `continue` in loops.
    - `break` means that you will exit from the current loop.
    - `continue` means that you will go up to the start of the current loop.

In [None]:
k = 0
while True: # infinite loop
    k += 1
    if k == 5:
        continue # if k == 5, go up to the start of the current loop
    elif k > 10:
        break # if k > 10, exit from the current loop
    print(k)

- If you want to handle an error instead of stopping the program, use `try` and `except`.
- Run the following code. You should get an error. Why?

In [None]:
list1 = []
for i in [1, 2, 0.5, "abc"]:
    list1.append(float(i))

- How about this?
    - You can also check all built-in errors [here](https://docs.python.org/3/library/exceptions.html#bltin-exceptions).

In [None]:
list1 = []
for i in [1, 2, 0.5, "abc"]:
    try:
        list1.append(float(i))
    except ValueError: # float("abc") raises a ValueError
        list1.append(float(-999))
        print("Error in item =", i)
print(list1)

Exercise 7:
1. Make a loop that prints all the keys of `dict1` one-by-one.
2. Replace `-50` with `50` if the key is `'bb'` inside the loop.
3. Print `dict1` (outside the loop).

In [None]:
dict1 = {'a':{'aa':20, 'ab':40}, 'b':{'ba':30, 'bb':-50}}

## 6. Functions<a id='sec6'></a>
- A function returns outputs given inputs. It may also accept optional arguments.
- Useful built-in functions include: `print()`, `float()`, `int()`, `str()`, `len()`, `max()`, `min()`, `round()`, `sorted()`, etc. 
    - (Strictly speaking, `float()`, `int()`, and `str()` are not functions. They are class constructors.)
- Some functions can be used only for specific types.

[back to top](#top)

- Compute the length of `list1`. Hint: `len()`.
- Can you also apply the same function for integer `x`? Try it.

In [None]:
list1 = ["a", "b"]
x = 2

- To see a help file, type `?` before or after a function, or use `help()`.
- Try `?round`, `round?`, and `help(round)`.

- It returns `round(number, ndigits=None)` and an explanation about the function. 
    - This means that the input is a number and the output is a rounded number. 
    - `ndigits` is an option by which you can specify a precision in decimal digits.
- For example, `round(1.85,1)` returns the closest float with one decimal digit, i.e., 1.9. 
- What about `round(1.84,1)`? Try it.
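
A few more calls to `round()` as a sketch of how `ndigits` behaves (the variable names are illustrative):

```python
two_digits = round(3.14159, 2)    # 3.14
no_digits = round(3.14159)        # 3: with ndigits omitted, the result is an int
hundreds = round(1234.5, -2)      # 1200.0: a negative ndigits rounds to tens, hundreds, ...

# Exact halves are rounded to the nearest even integer
halves = (round(0.5), round(1.5), round(2.5))   # (0, 2, 2)

print(two_digits, no_digits, hundreds, halves)
```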

- You can check the source code using `??`, if available.
    - Or use `getsource` in the `inspect` package.
    - However, neither method works for built-in functions.
- Check the source code of `mysq`. Also check the source code of `len`, which is a built-in function.

In [None]:
def mysq(x):
    return x ** 2

- You can make a function by yourself using `def`.
- Print `mysq(2)` after the following code.

In [None]:
def mysq(x):
    print("compute the square of", x)
    return x ** 2

- You should always write `return` at the end of the `def` block to return the output. 
- What happens if you do not include it? Print `mysq(2)` again using the following example.

In [None]:
def mysq(x):
    print("compute the square of", x)
    x ** 2

- Inputs and outputs can be more than one.
- Print `mysum(1, 2)`.
- How can we access each outcome? Recall how you can access items in a tuple.
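
As a sketch of the general pattern, here is a hypothetical helper (`min_and_max` is not part of this tutorial) that returns two outputs, accessed first by index and then by unpacking:

```python
def min_and_max(numbers):               # hypothetical helper for illustration
    return min(numbers), max(numbers)   # the two outputs are packed into a tuple

result = min_and_max([4, 1, 7])
print(result)       # (1, 7)
print(result[0])    # 1: index it like any tuple

low, high = min_and_max([4, 1, 7])      # or unpack into separate names
print(low, high)    # 1 7
```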

In [None]:
def mysum(x, y): # two inputs
    print("compute the sum of", x, "and", y)
    return (x, y, x + y) # three outputs in a tuple

- Alternatively, you can make a function using a **lambda function**.
- Print `mysum(1, 2)`.
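
Lambda functions are often written inline as an argument to another function. A minimal sketch with made-up data, using the `key` option of `sorted()`:

```python
pairs = [("tom", 1.75), ("jerry", 1.82), ("butch", 1.60)]   # illustrative data

# Sort by the second item of each pair; the lambda extracts the sort key
by_height = sorted(pairs, key=lambda pair: pair[1])
print(by_height)

square = lambda x: x ** 2    # a lambda can also be given a name
print(square(4))             # 16
```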

In [None]:
mysum = lambda x, y: (x, y, x + y)

Exercise 8:
1. Make a function that prints "Two items are equal." if inputs of a list of length two are the same, and "Two items are not equal." otherwise.
2. Test it using `list1`.

In [None]:
list1 = [1, 1.0]

## 7. Methods<a id='sec7'></a>
- A method is like a function attached to an object. 
- You write `object.method()` and it works like a function.
- Useful methods include: `index()`, `count()`, `append()`, `remove()`, `reverse()`, `sort()`, `capitalize()`, `upper()`, and `replace()`.
- Some methods can be used only for specific types.
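
A short sketch of a few of these methods on made-up objects. It also hints at a common pitfall: string methods return a new string, while list methods such as `sort()` change the list in place and return `None`.

```python
name = "tom"
capitalized = name.capitalize()     # 'Tom'
shouted = name.upper()              # 'TOM'
replaced = name.replace("t", "T")   # 'Tom'
a_count = "banana".count("a")       # 3
print(name)                         # tom: the original string is unchanged

numbers = [3, 1, 2]
returned = numbers.sort()           # sorts the list in place
print(numbers, returned)            # [1, 2, 3] None
```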

[back to top](#top)

- Find the index of `"tom"` in `list1`. Hint: `object.index(item)`.
- Reverse the order of `list1`. Hint: `object.reverse()`.
- Sort `list2`. Hint: `object.sort()`. Can you also sort `list1` too?

In [None]:
list1 = ["tom", 1.75, "jerry", 1.82]
list2 = [1, 5, 4, 3, 2]

- You can check all the available methods using `dir(object)`.
- Check the available methods for `list1`.

In [None]:
list1 = ["tom", 1.75, "jerry", 1.82]

- Alternatively, press the TAB key after writing `object.`.
- Check the available methods for `"tom"`.

In [None]:
"tom".

- To see a help file, write `help(object.method)`.
    - `?` does not always work.
- Try `help("tom".count)`. What about `?"tom".count`?

## 8. Modules and Packages<a id='sec8'></a>
- You can import useful modules and packages using `import`. 
    - A module can contain functions, classes, etc.
    - A package can contain submodules and subpackages.
    - There are 2,076,292 releases (as of 2020/09/16) in [PyPI](https://pypi.org/).
<!--  2,134,644 packages (as of 2019/09/06), 1,274,332 packages (as of 2018/04/29) -->
- A comma (`,`) can be used to import multiple modules and packages in one statement.
- If you get an error while importing modules/packages, you should install them using Command Prompt/Terminal. (See Lec 1 slides.)

[back to top](#top)

In [None]:
import math, numpy # math is a module and NumPy is a package.
print(math.radians(45)) # convert degrees to radians
print(numpy.radians(45))

- Use `import module/package as myname` to define the name of the module/package as `myname`.
- Tip: Use a simple and intuitive name.

In [None]:
import math as m, numpy as np
print(m.radians(45))
print(np.radians(45))

- Alternatively, use
    - `from module/package import function` to import a specific function.
    - `from package import module` to import a specific module.
    - `from module/package import *` to import all entries from a module/package.
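
The first two forms might look like this with the standard library (`urllib` is used here only as a convenient example of a package that contains a module):

```python
from math import sqrt, pi    # import two specific names from the math module
from urllib import parse     # import the parse module from the urllib package

root = sqrt(16)              # no "math." prefix is needed
rounded_pi = round(pi, 2)
quoted = parse.quote("a b")  # functions are reached through the module name

print(root, rounded_pi, quoted)   # 4.0 3.14 a%20b
```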

- Tip: The last option with a `*` sign is not always recommended. Try:
```python
from math import *
from numpy import *
deg = arange(12.) * 30.
print(radians(deg))
```
vs.
```python
from numpy import *
from math import *
deg = arange(12.) * 30.
print(radians(deg))
```
- You will get an error in the second case because `radians` in that case is `math.radians`, which does not accept a NumPy array as an input.

- You can list all loaded modules using `sys.modules.keys()` from the `sys` module.

In [None]:
import sys
sys.modules.keys()

- To check all the entries in a specific package/module, use `dir()`.
    - To understand how `dir()` behaves differently with different types of objects, see [this page](https://docs.python.org/3/library/functions.html#dir).
- Alternatively, press the TAB key after writing `module.` or `package.`.
- Try both for the `math` module.

In [None]:
import math

- To get the documentation of a package, use `?`.
    - `help()` does not always work.
- Get the documentation of NumPy.

In [None]:
import numpy as np

## 9. NumPy<a id='sec9'></a>
- NumPy (Numerical Python) is one of the fundamental packages for numerical computation and data manipulation in Python.
- Run the following code to import NumPy.

[back to top](#top)

In [None]:
import numpy as np

### A. Vectorized computation and ufuncs<a id='sec9_1'></a>
- A nice thing about NumPy is the NumPy array, a multidimensional container of items.
- For example, NumPy arrays allow item-by-item calculation, or **vectorized computation**.
    - By contrast, lists do not allow such calculation.
- Run the following code. What about `np_weight / np_height ** 2`? 

[back to top](#top)

In [None]:
height = [1.80, 1.76, 1.64]
weight = [80, 75, 60]
np_height = np.array(height) # convert a list to a NumPy array
np_weight = np.array(weight)
weight / height ** 2 # raises a TypeError: lists do not support item-by-item arithmetic

- More generally, NumPy has **ufuncs** (universal functions) that allow item-by-item calculation.
- For example, `np.abs(myarray)` converts ***all items*** in `myarray` to absolute values.
    - Other useful ufuncs include `np.log()` and `np.sqrt()`. (`np.round()` also works item by item.)
- Convert all items in `array1` to absolute values.
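
A minimal sketch of two other ufuncs applied to made-up arrays; ordinary arithmetic operators on arrays work item by item in the same way:

```python
import numpy as np

values = np.array([1.0, 4.0, 9.0])      # illustrative data
roots = np.sqrt(values)                 # square root of every item: [1. 2. 3.]
logs = np.log(np.array([1.0, np.e]))    # natural log of every item
scaled = values * 2 + 1                 # [ 3.  9. 19.]

print(roots, logs, scaled)
```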

In [None]:
array1 = np.array([-1, 2, -5, 1])

### B. Making an array<a id='sec9_2'></a>
- You can make a NumPy array in the following way.

[back to top](#top)

In [None]:
array1 = np.array([1, 2, 3]) # start from a list [1, 2, 3]. np.array converts it to a NumPy array

- There are some clever ways to make an array.

In [None]:
array1 = np.array(list(range(10))) # make a list using list() and range(). then convert it to an array
array2 = np.zeros(10) # make an array of 10 zeros
array3 = np.ones(10) # make an array of 10 ones
array4 = np.full((2,3),1.2) # make a 2 x 3 array of 1.2
array5 = np.arange(0,10,2) # make numbers from 0 up to (but not including) 10, incremented by 2
array6 = np.linspace(0,1,5) # split [0, 1] into 5 evenly spaced samples
array7 = np.random.randn(2,2) # make a 2 x 2 array of random values drawn from the standard normal distribution

### C. Type of items in an array<a id='sec9_3'></a>
- NumPy has more types than Python's built-in ones, such as `int8` and `float64`.
    - See e.g. [this page](https://numpy.org/devdocs/user/basics.types.html).
- For a NumPy array, you can explicitly set its type, if possible.
- Check the type of the third item of `array1`.
- Change the third item of `array1` to `"abc"` and execute it again. You should get an error. Why?

[back to top](#top)

In [None]:
array1 = np.array([1, 1.0, "1", True], dtype='float64')

- An array can contain only one type. ***If there are multiple types, they are converted to a single type!***
    - The order of the types: strings > floats > integers > booleans.
- Check the type of the first item in each array. Hint: Use `myarray[index]`.

In [None]:
array1 = np.array([1, 1.0, "1", True])
array2 = np.array([1, 1.0, True])
array3 = np.array([1, True])
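
The conversion order can also be checked on the array as a whole. This is a minimal sketch with made-up names (`mixed_str`, `mixed_float`, `mixed_int`); it relies on `dtype.kind`, a one-letter code describing the family of the array's type.

```python
import numpy as np

mixed_str = np.array([2, 0.5, "x", False])   # contains a string
mixed_float = np.array([2, 0.5, False])      # contains a float, no string
mixed_int = np.array([2, False])             # integers and booleans only

# dtype.kind: 'U' = string, 'f' = float, 'i' = integer
print(mixed_str.dtype.kind, mixed_str)
print(mixed_float.dtype.kind, mixed_float)
print(mixed_int.dtype.kind, mixed_int)
```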

### D. Accessing items<a id='sec9_4'></a>
- To access items in an array, use `myarray[index]`, `myarray[index1][index2]`, etc.
    - You can also write like `myarray[index1, index2]`, which is perhaps more common.
- You can use slicing as well.
    - Use `myarray[:, index]` to select all rows of a specific column and `myarray[index, :]` to select all columns of a specific row.
- Access the item in the first row and the second column of `array1`.
- Access all the columns of the first row of `array1`.

[back to top](#top)

In [None]:
array1 = np.array([[1, 2, 3], [4, 5, 6]])
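
The indexing patterns above can be sketched on a separate 3 x 3 array. The name `table` and its values are illustrative only. Remember that indices start at 0, so index `1` refers to the second row or column.

```python
import numpy as np

table = np.array([[10, 20, 30],
                  [40, 50, 60],
                  [70, 80, 90]])   # illustrative data

print(table[1, 2])      # second row, third column -> 60
print(table[1][2])      # the same item, written with two brackets
print(table[:, 0])      # all rows of the first column -> [10 40 70]
print(table[2, :])      # all columns of the third row -> [70 80 90]
print(table[0:2, 1:])   # first two rows, last two columns
```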

**a. Masking**<a id='sec9_4_1'></a>
- NumPy arrays allow other ways to select items.
- First, to access only items in an array that satisfy a condition (e.g., `myarray > 0`), you can use a **masking** operation.
    - The condition is called a **mask**.
- You can use relational operators like `==`, `>`, and `!=` for masking.
- For example, when you write `myarray > 0`, this returns a boolean array where `True` means that the associated item satisfies the condition `> 0`.
- To combine multiple Boolean conditions on arrays, use `&` and `|`. You cannot use `and` and `or`.
    - The former operators (like ufuncs) evaluate the conditions item by item, while the latter operators try to evaluate each whole array as a single `True` or `False`.
    - See e.g. [this page](https://jakevdp.github.io/PythonDataScienceHandbook/02.06-boolean-arrays-and-masks.html).
    
[back to top](#top)

- Run the following code.

In [None]:
array1 = np.array([[1, -1, -5], [2, -4, -3]])
print(array1)
print(array1 > 0)
print((array1 > -3) & (array1 < 2)) # do not forget parentheses!
#print((array1 > -3) and (array1 < 2)) # this returns an error

- If you apply a mask to an array, you can access only items that satisfy the condition.

In [None]:
array1 = np.array([[1, -1, -5], [2, -4, -3]])
print(array1[(array1 > 0)])
print(array1[(array1 > -3) & (array1 < 2)])

- Masking can be cleverly used for computation.

In [None]:
array1 = np.array([[1, -1, -5], [2, -4, -3]])
print(np.sum(array1 > 0)) # count the number of positive items 
print(np.any(array1 > 0)) # check whether any item in array1 is positive
print(np.all(array1 > 0)) # check whether all items in array1 are positive
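
The following sketch stores a mask in a variable and reuses it. The names `temps` and `mask` are illustrative. The `~` operator, which is not covered above, negates a mask item by item. The last part shows the error you get from `and`.

```python
import numpy as np

temps = np.array([[1, -1, -5], [2, -4, -3]])   # illustrative data
mask = (temps > -3) & (temps < 2)              # item-by-item "and"

print(temps[mask])          # items that satisfy both conditions
print(temps[~mask])         # items that do not
print(temps[mask].mean())   # mean of the selected items only

# `and` needs a single True/False for the whole array, which is ambiguous
try:
    (temps > -3) and (temps < 2)
except ValueError as err:
    print("error:", err)
```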

**b. Fancy indexing**<a id='sec9_4_2'></a>
- Second, you can apply a list or an array of indices to access items in an array.
- This style of selection is called **fancy indexing**.
    - Fancy indexing is different from masking in the sense that it is a selection using indices.
- You write like `myarray[list/array]`.
- Apply `index_list` and `index_array` to `array1`, respectively.

[back to top](#top)

In [None]:
array1 = np.array(['a', 'b', 'c', 'd'])
print(array1)
index_list = [0, 3] # a list of indices
index_array = np.array([[1, 3], [1, 2]]) # an array of indices
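
Two properties of fancy indexing are worth seeing in a small sketch: indices can repeat and appear in any order, and the result takes the shape of the index array. The names `letters`, `picked`, and `index_2d` are made up for this example.

```python
import numpy as np

letters = np.array(['p', 'q', 'r', 's'])   # illustrative data

picked = letters[[3, 0, 0]]   # indices may repeat and come in any order
print(picked)                 # ['s' 'p' 'p']

index_2d = np.array([[0, 1], [2, 3]])
print(letters[index_2d].shape)   # (2, 2): same shape as the index array
```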

### E. Checking dimension, shape, size, and length of an array<a id='sec9_5'></a>
- You can check the dimension of an array using `np.ndim(myarray)`, the shape of an array using `np.shape(myarray)`, the size (= the number of items) of an array using `np.size(myarray)`, and the length of an array using `len(myarray)`.
    - You can also use `ndim`, `shape`, and `size` as attributes of an array, like `myarray.ndim`, `myarray.shape`, and `myarray.size`. (They are attributes, not methods, so you do not add parentheses.)
- Print `array1` and `array2`. Check their dimension, shape, and size.
- A two dimensional NumPy array is also called a 2D NumPy array.

[back to top](#top)

In [None]:
array1 = np.array([[1, 2, 3], [4, 5, 6]])
array2 = np.array([range(i,i + 3) for i in [2, 4, 6]])
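
As a quick sketch of how the four quantities differ, consider an illustrative 4 x 3 array named `grid`. Note that `len()` only reports the length of the first dimension.

```python
import numpy as np

grid = np.zeros((4, 3))   # illustrative 4 x 3 array

print(grid.ndim)    # 2: number of dimensions
print(grid.shape)   # (4, 3): rows and columns
print(grid.size)    # 12: total number of items (4 * 3)
print(len(grid))    # 4: length of the first dimension (rows)
```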

### F. Changing arrays<a id='sec9_6'></a>
**a. Changing items**<a id='sec9_6_1'></a>
- As with lists, you can change an item of an array using `myarray[index] = new_item`.
- Change all `-1`'s to `10` in `array1` using masking. Then print it.

[back to top](#top)

In [None]:
array1 = np.array([[1, 2, -1], [4, -1, 6]])
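
A related sketch on different data: the code below changes one item by index and then sets every negative item to zero with a mask. The name `scores` and its values are illustrative.

```python
import numpy as np

scores = np.array([[5, -2, 7], [-1, 3, -4]])   # illustrative data

scores[0, 1] = 2          # change a single item by index
scores[scores < 0] = 0    # change every item that satisfies the mask
print(scores)
```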

**b. Adding items**<a id='sec9_6_2'></a>
- You can use `np.append()` to add items to an array.
    - You write like `np.append(myarray1, [new items], axis=0)`, where `axis` can be `1` (add a column) or `0` (add a row).
    - You need to write the correct structure of `[new items]`.
- Print `array1`, `array2`, and `array3`.

[back to top](#top)

In [None]:
array1 = np.array([[1, -1, -5], [2, -4, -3]])
array2 = np.append(array1, [[3, 5, 6]], axis=0) # add a row
#array2 = np.append(array1, [3, 5, 6], axis=0) # this returns an error
array3 = np.append(array1, [[3], [6]], axis=1) # add a column
#array3 = np.append(array1, [3, 6], axis=1) # this returns an error
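
Two behaviors of `np.append()` are easy to overlook, so here is a minimal sketch with illustrative names (`base`, `with_row`, `flat`). First, it returns a new array rather than changing the original. Second, if you omit `axis`, both inputs are flattened before the items are added.

```python
import numpy as np

base = np.array([[1, 2], [3, 4]])   # illustrative data

with_row = np.append(base, [[5, 6]], axis=0)
print(with_row.shape)   # (3, 2)
print(base.shape)       # (2, 2): the original array is unchanged

flat = np.append(base, [[5, 6]])   # no axis: everything is flattened first
print(flat)                        # [1 2 3 4 5 6]
```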

**c. Concatenating arrays**<a id='sec9_6_3'></a>
- Alternatively, to combine multiple arrays, you can use `np.concatenate()`.
    - You write like `np.concatenate([myarray1, myarray2], axis=0)`, where `axis` can be `1` (add a column) or `0` (add a row).
    - Other functions: `hstack()` and `vstack()`.
- Run the following code. Next, use `np.array()` instead of `np.concatenate()`. Can you see the difference?

[back to top](#top)

In [None]:
array1 = np.array([[1, 2], [3, 4]])
array2 = np.array([[5, 6], [7, 8]])
array3 = np.concatenate([array1, array2], axis=0)
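
For 2D arrays, `np.vstack()` and `np.hstack()` can be read as shortcuts for the two `axis` choices. The sketch below checks this on two illustrative arrays, `top` and `bottom`.

```python
import numpy as np

top = np.array([[1, 2], [3, 4]])      # illustrative data
bottom = np.array([[5, 6], [7, 8]])

stacked_rows = np.vstack([top, bottom])   # stack vertically (add rows)
stacked_cols = np.hstack([top, bottom])   # stack horizontally (add columns)

print(stacked_rows.shape)   # (4, 2)
print(stacked_cols.shape)   # (2, 4)

# For 2D arrays these match np.concatenate with axis=0 and axis=1
print(np.array_equal(stacked_rows, np.concatenate([top, bottom], axis=0)))
print(np.array_equal(stacked_cols, np.concatenate([top, bottom], axis=1)))
```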

**d. Deleting items**<a id='sec9_6_4'></a>
- Deleting items from an array is like taking a subset of it.
- Alternatively, use `np.delete()`.
    - You write like `np.delete(myarray, index)`.
    - For 2D arrays, write like `np.delete(myarray, index, axis=1)` for deleting columns and `np.delete(myarray, index, axis=0)` for deleting rows.
- Delete `3` from `array1`.
- Instead, delete the second column of `array1`.

[back to top](#top)

In [None]:
array1 = np.array([[1, 2, 3], [4, 5, 6]])
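
The sketch below compares the options on a separate illustrative array named `grid`. Note that `np.delete()` without `axis` flattens the array first, and that it always returns a new array.

```python
import numpy as np

grid = np.array([[10, 20, 30], [40, 50, 60]])   # illustrative data

no_first_row = np.delete(grid, 0, axis=0)   # delete the first row
no_last_col = np.delete(grid, 2, axis=1)    # delete the third column
flat = np.delete(grid, 0)                   # no axis: flatten, then delete item 0
subset = grid[:, [0, 1]]                    # same as no_last_col, taken as a subset

print(no_first_row)
print(no_last_col)
print(flat)
```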

**e. Reshaping an array**<a id='sec9_6_5'></a>
- You can change the shape of an array (e.g., change a 3 x 1 array to a 1 x 3 array).
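- To reshape an array, use `myarray.reshape(#rows, #columns)`.
    - `np.reshape(myarray, (#rows, #columns))` also works, but there you pass the array as the first argument and the new shape as a tuple.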
- Change `array1` to a 3 x 2 array.

[back to top](#top)

In [None]:
array1 = np.array([[1, 2, 3], [4, 5, 6]])
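
A minimal sketch of reshaping, using an illustrative one-dimensional array named `row`. It also shows what happens when the new shape cannot hold exactly the same number of items.

```python
import numpy as np

row = np.array([1, 2, 3, 4, 5, 6])   # illustrative data

col = row.reshape(6, 1)     # 6 x 1 array
wide = row.reshape(2, 3)    # 2 x 3 array, filled row by row
print(wide)

# The new shape must hold exactly the same number of items
try:
    row.reshape(4, 2)
except ValueError as err:
    print("error:", err)
```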

### G. Summarizing items<a id='sec9_7'></a>
- You can easily compute statistics, e.g., min, max, mean, and sum of items in an array.
- Useful functions for summarizing items include: `min()`, `max()`, `mean()`, `median()`, `std()`, `sum()`, and `corrcoef()`.
- For example, write `np.min(myarray)`.
    - You can also write `myarray.min()` for some operations.
- Access the smallest number in `array1`.
- Next, access the smallest number of column 2 in `array1`.
- Finally, use the `axis=0` or `axis=1` option in the above example. What did you get?

[back to top](#top)

In [None]:
array1 = np.array([[1, 2, 3], [-1, -2, -3]])
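
How `axis` changes the result is easiest to see on a small example. The array `sales` below is illustrative. With `axis=0` you get one value per column, and with `axis=1` one value per row.

```python
import numpy as np

sales = np.array([[3, 8, 1], [6, 2, 9]])   # illustrative data

print(np.min(sales))           # 1: smallest item overall
print(np.min(sales, axis=0))   # [3 2 1]: smallest item in each column
print(np.min(sales, axis=1))   # [1 2]: smallest item in each row
print(sales[:, 1].mean())      # 5.0: mean of the second column
```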

## 10. Pandas<a id='sec10'></a>
- Pandas is a very useful Python package made for handling data.
- Since it's built on NumPy, there are some similarities.

[back to top](#top)

In [None]:
import pandas as pd

- Pandas has `Series`, `DataFrame`, and `Index`.
    - `Series`: A **one-dimensional array** of indexed data
    - `DataFrame`: A **two-dimensional array** with indices (for rows) and column names (for columns)
    - `Index`: An index object itself

### A. Series<a id='sec10_1'></a>
- A Pandas series is like a generalized one-dimensional NumPy array.
    - It is generalized in the sense that (a) it preserves types and (b) it can have user-defined indices.
- What is the type of the third item in `series1`?

[back to top](#top)

In [None]:
series1 = pd.Series([1.0, 2, "3"], index=['a', 'b', 'c']) # you can define your own indices
print(series1)
print(series1.values) # print values
print(series1.index) # print indices
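
To see what "preserves types" means in practice, this sketch builds a NumPy array and a Pandas series from the same mixed list. The names `items`, `as_array`, and `as_series` are made up for illustration.

```python
import numpy as np
import pandas as pd

items = [0.5, 7, "x"]   # illustrative mixed-type data

as_array = np.array(items)
as_series = pd.Series(items, index=['first', 'second', 'third'])

print(as_array.dtype.kind)                     # 'U': every item became a string
print([type(v).__name__ for v in as_series])   # each item keeps its own type
print(as_series.dtype)                         # object
```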

#### a. Making a series<a id='sec10_1_1'></a>
- You can make a series from lists, dictionaries, NumPy arrays, etc.
- Print `series1` and `series2`.

[back to top](#top)

In [None]:
list1 = [1.75, 1.82] # make a series from a list
series1 = pd.Series(list1, index=['tom', 'jerry'])
dict1 = {'tom':1.75, 'jerry':1.82} # make a series from a dictionary
series2 = pd.Series(dict1)

#### b. Accessing items<a id='sec10_1_2'></a>
- Since a Pandas series is like a NumPy array, you can use masking and fancy indexing to access items.
- Moreover, you can use user-defined indices to access items, rather than built-in indices.

[back to top](#top)

In [None]:
dict1 = {'tom':1.75, 'jerry':1.82, 'spike':1.65}
series1 = pd.Series(dict1)
print(series1[series1 > 1.8]) # masking
print(series1[0:2]) # slicing (using built-in indices)
print(series1['tom':'jerry']) # slicing (using user-defined indices)
print(series1.iloc[[0, 2]]) # fancy indexing (using built-in indices). .iloc selects by position; recent Pandas versions warn about series1[[0, 2]]
print(series1[['tom', 'spike']]) # fancy indexing (using user-defined indices)

### B. DataFrame<a id='sec10_2'></a>
- A Pandas dataframe is like a generalized two-dimensional NumPy array.
    - It is generalized in the sense that (a) it preserves types and (b) it can have user-defined indices.
- Run the following code. Also, check the type of spike's weight.

[back to top](#top)

In [None]:
dict1, dict2 = {'tom':1.75, 'jerry':1.82, 'spike':1.65}, {'tom':65, 'jerry':72, 'spike':"58"}
df1 = pd.DataFrame({'height':dict1, 'weight':dict2}) # height and weight are column names
print(df1)
print(df1.values) # print values
print(df1.index) # print indices
print(df1.columns) # print column names

#### a. Making a dataframe<a id='sec10_2_1'></a>
- You can make a dataframe from lists, dictionaries, NumPy arrays, Pandas series, etc.
- Print `df1` and `df2`.

[back to top](#top)

In [None]:
list1, list2 = [1.75, 65], [1.82, 72] # make a dataframe from lists
df1 = pd.DataFrame([list1, list2], index=['tom', 'jerry'], columns=['height', 'weight']) 
dict1, dict2 = {'tom':1.75, 'jerry':1.82}, {'tom':65, 'jerry':72} # make a dataframe from dictionaries
df2 = pd.DataFrame({'height':dict1, 'weight':dict2}) 

#### b. Accessing items<a id='sec10_2_2'></a>
- Since a Pandas dataframe is like a NumPy array, you can use masking and fancy indexing.
- Moreover, you can use user-defined indices and column names to access items.
- However, slicing and fancy indexing do not always work! Run the following code to check it.

[back to top](#top)

In [None]:
df1 = pd.DataFrame([[1.75, 65], [1.82, 74], [1.65, 58]], columns=['height', 'weight'], index=['tom', 'jerry', 'spike'])
print(df1)
print(df1[df1 > 50]) # masking
print(df1['tom':'jerry']) # slicing using (user-defined) indices. this works
#print(df1['height':'weight']) # slicing using (user-defined) column names. this returns an error
#print(df1[['tom', 'spike']]) # fancy indexing using (user-defined) indices. this returns an error
print(df1[['height', 'weight']]) # fancy indexing using (user-defined) column names. this works

- This is inconvenient. How can we solve the issue?
- For dataframes, it is more convenient to use `mydf.iloc` (for built-in indices) or `mydf.loc` (for user-defined indices/column names).
    - `iloc` and `loc` are called **indexers**.
- Run the following code.

In [None]:
df1 = pd.DataFrame([[1.75, 65], [1.82, 74], [1.65, 58]], columns=['height', 'weight'], index=['tom', 'jerry', 'spike'])
print(df1.iloc[1, 1]) # access row 2 column 2 item (using built-in row and column indices)
print(df1.loc['jerry', 'weight']) # access row 2 column 2 item (using user-defined indices and column names)
print(df1.loc[['tom','spike'], 'height':'weight']) # applying slicing and fancy indexing
print(df1.loc[df1['weight'] > 60, ['weight','height']]) # applying masking and fancy indexing
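
One difference between the two indexers deserves a sketch: a slice in `iloc` excludes its end point, while a slice in `loc` includes it. The dataframe `pets` and its values are illustrative.

```python
import pandas as pd

pets = pd.DataFrame([[4.2, 3], [30.5, 7], [0.3, 1]],
                    columns=['weight', 'age'],
                    index=['cat', 'dog', 'hamster'])   # illustrative data

by_position = pets.iloc[0:2, 0]             # positions 0 and 1 (2 is excluded)
by_label = pets.loc['cat':'dog', 'weight']  # 'cat' to 'dog' ('dog' is included)
print(by_position)
print(by_label)

older = pets.loc[pets['age'] > 2, 'weight']   # masking rows, then one column
print(older)
```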

- Alternatively, if you want to access items only from one column, you can use `mydf['column']` or `mydf.column` to make a Pandas series before accessing items.

In [None]:
df1 = pd.DataFrame([[1.75, 65], [1.82, 74], [1.65, 58]], columns=['height', 'weight'], index=['tom', 'jerry', 'spike'])
print(df1.height) # or print(df1['height'])
print(type(df1.height)) # type is a Pandas series
print(df1.height[0:2]) # slicing
print(df1.height.iloc[[0, 2]]) # fancy indexing (.iloc selects by position)

- Other useful ways to extract items are `mydf.head()` and `mydf.tail()`.

In [None]:
df1 = pd.DataFrame([[1.75, 65], [1.82, 74], [1.65, 58]], columns=['height', 'weight'], index=['tom', 'jerry', 'spike'])
print(df1.head(2)) # access the first two rows
print(df1.tail(2)) # access the last two rows

#### c. Checking dimension, shape, size, and length of a dataframe<a id='sec10_2_3'></a>
- Similar to NumPy, you can use `mydf.ndim`, `mydf.shape`, `mydf.size`, and `len()` for dataframes.
    - To check the number of rows, use `len(mydf)` or `len(mydf.index)`. For columns, use `len(mydf.columns)`.
- You can also use `mydf.info()` to get more detailed information.
- Check the dimension, shape, size, and length of `df1`.
- Also, check the detailed information of `df1` using `info()`.

[back to top](#top)

In [None]:
df1 = pd.DataFrame([[1.75, 65], [1.82, 74], [1.65, 58]], columns=['height', 'weight'], index=['tom', 'jerry', 'spike'])

#### d. Changing and adding items<a id='sec10_2_4'></a>
- Similar to the list and the array, you can change the items in a dataframe and add a column and/or a row.
    - To add a column, you can use `mydf.loc[:, 'column'] = new_item` or `mydf['column'] = new_item`.
    - To add a row, you can use `mydf.loc['index', :] = new_item`.
    - `mydf.iloc[]` can change existing items, but it cannot add a new row or column because it only accepts positions that already exist.

[back to top](#top)

- Change the weight of jerry to `72` in `df1`. Print it.
- Then, add tom and jerry's age using `[58, 75]`. The column name should be `'age'`. Print it.
- Then, add spike's height, weight, and age using `[1.65, 58, 62]`. Print it.

In [None]:
df1 = pd.DataFrame([[1.75, 65], [1.82, 74]], columns=['height', 'weight'], index=['tom', 'jerry'])
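
The same operations can be sketched on a different, illustrative dataframe named `pets`. The last part shows the error raised when `iloc` is asked to write to a row position that does not exist yet.

```python
import pandas as pd

pets = pd.DataFrame([[4.2, 3], [30.5, 7]],
                    columns=['weight', 'age'],
                    index=['cat', 'dog'])   # illustrative data

pets.loc['cat', 'age'] = 4              # change an existing item
pets['legs'] = [4, 4]                   # add a column
pets.loc['hamster', :] = [0.3, 1, 4]    # add a row by label
print(pets)

# iloc works by position, so it cannot create a new row
try:
    pets.iloc[3, :] = [1.0, 2, 4]
except IndexError as err:
    print("error:", err)
```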

#### e. Concatenating dataframes<a id='sec10_2_5'></a>
- Similar to NumPy's `np.concatenate()`, you can concatenate Pandas dataframes and series using `pd.concat()`.
    - You write like `pd.concat([mydf1, mydf2], axis=0)`, where `axis` can be `1` (add a column) or `0` (add a row).
    - (Older versions of Pandas also allowed `mydf1.append(mydf2)` to add a row. This method was deprecated in Pandas 1.4 and removed in Pandas 2.0, so use `pd.concat()` instead.)

[back to top](#top)

- Concatenate `df1` and `df2` vertically. Then, concatenate `df1` and `df3` horizontally.

In [None]:
df1 = pd.DataFrame([[1.75, 65], [1.82, 74]], columns=['height', 'weight'], index=['tom', 'jerry'])
df2 = pd.DataFrame({'height':1.67, 'weight':58}, index=['spike'])
df3 = pd.DataFrame({'tom':75, 'jerry':60}, index=['age']).T # df.T means the transpose of df
print(df1); print(df2); print(df3)
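
A minimal sketch of both directions, using illustrative dataframes (`adults`, `young`, `legs`). When stacking rows, Pandas lines up the columns by name. When stacking columns, it lines up the rows by index.

```python
import pandas as pd

adults = pd.DataFrame([[4.2, 3], [30.5, 7]],
                      columns=['weight', 'age'],
                      index=['cat', 'dog'])   # illustrative data
young = pd.DataFrame({'weight': 0.3, 'age': 1}, index=['hamster'])
legs = pd.DataFrame({'legs': [4, 4]}, index=['cat', 'dog'])

taller = pd.concat([adults, young], axis=0)   # add rows
wider = pd.concat([adults, legs], axis=1)     # add columns
print(taller)
print(wider)
```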

#### f. Deleting items<a id='sec10_2_6'></a>
- Deleting items from a dataframe is like taking a subset.
- Alternatively, you can use `mydf.drop()`.
    - You write like `mydf.drop('column', axis=1)` for deleting a column and `mydf.drop('index', axis=0)` for deleting a row.
    - You need to define a new object to reflect the change.
- Delete the `'height'` column in `df1`.
- Instead, delete the `'tom'` row in `df1`.

[back to top](#top)

In [None]:
df1 = pd.DataFrame([[1.75, 65], [1.82, 74]], columns=['height', 'weight'], index=['tom', 'jerry'])
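
The point about defining a new object can be checked directly. In this sketch (with an illustrative dataframe `pets`), `drop()` returns a new dataframe and the original keeps its shape.

```python
import pandas as pd

pets = pd.DataFrame([[4.2, 3], [30.5, 7]],
                    columns=['weight', 'age'],
                    index=['cat', 'dog'])   # illustrative data

no_age = pets.drop('age', axis=1)   # delete a column
no_cat = pets.drop('cat', axis=0)   # delete a row

print(no_age.columns.tolist())   # ['weight']
print(no_cat.index.tolist())     # ['dog']
print(pets.shape)                # (2, 2): the original is unchanged
```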

#### g. Changing column names/indices<a id='sec10_2_7'></a>
- To rename column names/indices, write `mydf.rename(columns={'old1':'new1', ...}, index={'old1':'new1', ...})`.
    - You need to define a new object to reflect the change.
- Change the column name from `'hheight'` to `'height'` and the index from `'ttom'` to `'tom'` using `df1`.

[back to top](#top)

In [None]:
df1 = pd.DataFrame([[1.75, 65], [1.82, 74]], columns=['hheight', 'weight'], index=['ttom', 'jerry'])

#### h. Sorting items<a id='sec10_2_8'></a>
- To sort a dataframe by a row or column, use `mydf.sort_values(by='column name', axis=0)` or `mydf.sort_values(by='index', axis=1)`.
    - Available option: `ascending`, etc.
    - You can use more than one column name or index (e.g., `by=['column name1', 'column name2']`).
- You can also use `mydf.sort_index()` to sort by index.
    - Available option: `ascending`, etc.
- You need to define a new object to reflect the change.

[back to top](#top)

- Sort `df1` by the `'height'` column in descending order and save the result as `df2`. Hint: Use the `ascending=False` option.
- Then, sort `df2` in the reversed alphabetical order (from z to a) using the indices.

In [None]:
df1 = pd.DataFrame([[1.75, 65], [1.82, 74]], columns=['height', 'weight'], index=['tom', 'jerry'])
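
Here is a short sketch of both sorting methods on an illustrative dataframe named `pets`. Both return a new dataframe, so the original order is kept unless you assign the result.

```python
import pandas as pd

pets = pd.DataFrame([[4.2, 3], [30.5, 7], [0.3, 1]],
                    columns=['weight', 'age'],
                    index=['cat', 'dog', 'hamster'])   # illustrative data

by_weight = pets.sort_values(by='weight', ascending=False)
by_name = pets.sort_index(ascending=False)

print(by_weight.index.tolist())   # ['dog', 'cat', 'hamster']
print(by_name.index.tolist())     # ['hamster', 'dog', 'cat']
print(pets.index.tolist())        # ['cat', 'dog', 'hamster']: unchanged
```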

#### i. Transposing a dataframe<a id='sec10_2_9'></a>
- To transpose a dataframe, use `mydf.T`.
- Transpose `df1`.

[back to top](#top)

In [None]:
df1 = pd.DataFrame([[1.75, 65], [1.82, 74]], columns=['height', 'weight'], index=['tom', 'jerry'])

#### j. Summarizing items<a id='sec10_2_10'></a>
- Similar to NumPy, you can summarize items using `min`, `max`, `mean`, `std`, `sum`, etc.
    - You write like `mydf.sum()`.
    - You can also choose the direction using the `axis` option: `axis=0` (the default) summarizes down the rows and returns one value per column, while `axis=1` summarizes across the columns and returns one value per row.
- Compute `df1.sum()`. Then `df1.sum(axis=1)`. What is the difference?

[back to top](#top)

In [None]:
df1 = pd.DataFrame([[1.75, 65], [1.82, 74], [1.65, 58]], columns=['height', 'weight'], index=['tom', 'jerry', 'spike'])
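
The effect of `axis` is easier to read off with round numbers. The dataframe `counts` below is illustrative.

```python
import pandas as pd

counts = pd.DataFrame([[1, 2], [3, 4], [5, 6]],
                      columns=['left', 'right'],
                      index=['a', 'b', 'c'])   # illustrative data

col_totals = counts.sum()         # one total per column: left 9, right 12
row_totals = counts.sum(axis=1)   # one total per row: a 3, b 7, c 11
print(col_totals)
print(row_totals)
```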

- You can also use `mydf.describe()` to get a summary table.
- Get the summary table of `df1`.

In [None]:
df1 = pd.DataFrame([[1.75, 65], [1.82, 74], [1.65, 58]], columns=['height', 'weight'], index=['tom', 'jerry', 'spike'])