# Introduction to Python


## Basic Data Types

| English name          | Type name  | Type Category  | Description                                   | Example                                   |
| :-------------------- | :--------- | :------------- | :-------------------------------------------- | :---------------------------------------- |
| integer               | `int`      | Numeric Type   | positive/negative whole numbers               | `42`                                      |
| floating point number | `float`    | Numeric Type   | real number in decimal form                   | `3.14159`                                 |
| boolean               | `bool`     | Boolean Values | true or false                                 | `True`                                    |
| string                | `str`      | Sequence Type  | text                                          | `"Can I have a cheezburger?"`             |
| list                  | `list`     | Sequence Type  | a collection of objects - mutable & ordered   | `['Ali', 'Xinyi', 'Miriam']`              |
| tuple                 | `tuple`    | Sequence Type  | a collection of objects - immutable & ordered | `('Thursday', 6, 9, 2023)`                |
| dictionary            | `dict`     | Mapping Type   | mapping of key-value pairs                    | `{'name':'IE', 'code':6600, 'credits':2}` |
| none                  | `NoneType` | Null Object    | represents no value                           | `None`                                    |


- **value** is the data itself, such as a number or a piece of text
- **variable** is the name that refers to a value
- **type** is the kind of data stored in a variable, e.g. `37` is an integer and `'Visualization'` is a string


### 1. Numeric Data Types


In [5]:
x = 23

In [6]:
print(x)

23


In [7]:
x

23

In [8]:
type(x)

int

In [9]:
exp = 2.718

In [10]:
exp

2.718

In [11]:
type(exp)

float

### 2. Arithmetic Operators


| Operator |            Description            |
| :------: | :-------------------------------: |
|   `+`    |             addition              |
|   `-`    |            subtraction            |
|   `*`    |          multiplication           |
|   `/`    |             division              |
|   `**`   |          exponentiation           |
|   `//`   | integer division / floor division |
|   `%`    |              modulo               |


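Default order of operations, from highest to lowest precedence (multiplication and division share one level and are evaluated left to right, as do addition and subtraction):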

1. Parentheses
1. Exponentiation
1. Multiplication
1. Division
1. Addition
1. Subtraction
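
A few quick checks can make the precedence rules above concrete. This is only a small sketch, and the variable names are chosen for illustration. It also shows two details that are easy to miss: `**` groups from right to left, and it binds more tightly than a leading minus sign.

```python
a = 2 + 3 * 4       # multiplication first: 2 + 12
b = (2 + 3) * 4     # parentheses first: 5 * 4
c = 2 ** 3 ** 2     # exponentiation groups right to left: 2 ** 9
d = 8 / 4 * 2       # same precedence level, left to right: 2.0 * 2
e = -2 ** 2         # exponentiation binds tighter than unary minus: -(2 ** 2)

print(a, b, c, d, e)
```

When in doubt, adding parentheses makes the intended order explicit and costs nothing.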


In [12]:
1 + 4 + 10 + 66

81

In [13]:
1.0 + 1.0

2.0

In [14]:
2/3

0.6666666666666666

In [15]:
5*exp

13.59

In [16]:
8 % 3

2

In [17]:
5 // 3

1

In [18]:
-5 // 3

-2

In [19]:
2 ** 5

32

### 3. `None`


In [20]:
x = None

In [21]:
print(x)

None


In [22]:
type(x)

NoneType

### 4. Strings


In [23]:
name = 'qurat'

In [24]:
name

'qurat'

In [25]:
print(name)

qurat


In [26]:
course = "IE 6600"

In [27]:
print(course)

IE 6600


In [28]:
description = '''Basic computation in Python
How to create and critique visualizations
Data storytelling
'''

In [29]:
description

'Basic computation in Python\nHow to create and critique visualizations\nData storytelling\n'

In [30]:
print(description)

Basic computation in Python
How to create and critique visualizations
Data storytelling



In [31]:
type(description)

str

### 5. Boolean


In [32]:
truth = True
lies = False

In [33]:
type(truth)

bool

In [34]:
truth

True

In [35]:
print(lies)

False


### 6. Comparison Operators

| Operator  | Description                          |
| :-------- | :----------------------------------- |
| `x == y`  | is `x` equal to `y`?                 |
| `x != y`  | is `x` not equal to `y`?             |
| `x > y`   | is `x` greater than `y`?             |
| `x >= y`  | is `x` greater than or equal to `y`? |
| `x < y`   | is `x` less than `y`?                |
| `x <= y`  | is `x` less than or equal to `y`?    |
| `x is y`  | is `x` the same object as `y`?       |


In [36]:
2 < 3

True

In [37]:
2 != 3

True

In [38]:
2 == 3

False

In [39]:
'Visualization' == 'the world of data analytics'

False

In [40]:
2 is "2"

False

In [41]:
2 == 2.0

True

In [42]:
2 is 2.0

False
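
The difference between `==` (equal values) and `is` (the same object) is easier to see with lists than with literals. The following is a small illustrative sketch; the names `a`, `b`, `c`, and `x` are made up for this example.

```python
a = [1, 2, 3]
b = a          # b is a second name for the very same list object
c = list(a)    # c is a new list with equal contents

print(a == c)  # compares values
print(a is c)  # compares identity: two distinct objects
print(a is b)  # two names, one object

x = None
print(x is None)  # `is` is the idiomatic way to test for None
```

In practice, `==` is the right choice for comparing numbers, strings, and collections, while `is` is mostly reserved for checks against `None`.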

### 7. Logical Operators

Evaluates to either `True` or `False`:

| Operator  | Description                          |
| :-------: | :----------------------------------- |
| `x and y` | are `x` and `y` both True?           |
| `x or y`  | is at least one of `x` and `y` True? |
|  `not x`  | is `x` False?                        |


In [43]:
True and True

True

In [44]:
True and False

False

In [45]:
True or False

True

In [46]:
not True

False

In [47]:
not not False

False

In [48]:
(2 < 3) and (4 != 5)

True

### 8. Converting Data Types


In [49]:
x = 5.0
type(x)

float

In [50]:
y = int(x)
y

5

In [51]:
type(y)

int

In [52]:
int(5.8)

5

In [53]:
float(6)

6.0

In [54]:
z = str(6.9)
type(z)

str

In [55]:
float('hello')

ValueError: could not convert string to float: 'hello'
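
When a conversion might fail, one option is to catch the `ValueError` instead of letting the program stop. The helper below, `to_float`, is a hypothetical name introduced for this sketch and is not part of Python.

```python
def to_float(text, default=None):
    """Convert text to a float, returning `default` if the conversion fails."""
    try:
        return float(text)
    except ValueError:
        # float() raises ValueError for strings that are not numbers
        return default

print(to_float("6.9"))
print(to_float("hello"))
print(to_float("hello", 0.0))
```

Returning a default value is a design choice; in other situations it may be better to let the error surface so that bad input is noticed early.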

### 9. Lists and tuples


In [None]:
my_list = [1, 2.0, 'MSDAE', True, 4]

In [None]:
my_list

In [None]:
type(my_list)

In [None]:
another_list = [1, 'two', [3, 4, 'five'], True, None, {'key': 'value'}]
print(another_list)

In [None]:
len(my_list)

In [None]:
# len(another_list)

In [None]:
another_list.append(100)
another_list

In [None]:
my_tuple = (1, 2.0, 'MSDAE', True)
print(my_tuple)

In [None]:
type(my_tuple)

### 10. Slicing and Indexing Sequences


In [None]:
my_list

In [None]:
my_list[0]

In [None]:
my_list[2]

In [None]:
# my_list[5]

In [None]:
my_list[-1]

In [None]:
my_list[-2]

In [None]:
my_list[1:3] # start at index 1 and return elements up to, but not including, index 3

In [None]:
my_list[:3]

In [None]:
my_list[3:]

In [None]:
my_list[::1]

In [None]:
my_list[::2]

In [None]:
my_list[::3]

In [None]:
my_list[::-1]

### 11. Common List Methods


In [None]:
primes = [2, 3, 5, 7, 11]
primes

In [None]:
min(primes)

In [None]:
max(primes)

In [None]:
sum(primes)

In [None]:
nums = [23, 5_000, -3, 125, 999, 2, 0.2, 0]

In [None]:
nums.sort() # inplace
nums

In [None]:
nums.sort(reverse=True)
nums

In [None]:
nums = [23, 5_000, -3, 125, 999, 2, 0.2, 0]
sorted(nums) # Not inplace

In [None]:
nums

In [None]:
words = ['DAE', 'Qurat', 'Northeastern', 'morning', 'Amber']
sorted(words)

### 12. Sets


In [None]:
s = {2, 3, 5, 11}
s

In [None]:
{1, 2, 3} == {3, 1, 2}

In [None]:
[1, 2, 3] == [3, 1, 2]

In [None]:
s.add(5) # does nothing!
s

In [None]:
s[0]

In [None]:
s.union({"I", "am", "well"}) # not inplace

In [None]:
s

### 13. Mutable vs Immutable Types

- lists are mutable
- strings are immutable
- tuples are immutable


In [None]:
names_list = ["Indiana", "Fang", "Linsey"]
names_list

In [None]:
names_list[0] = "Cool guy"
names_list

In [None]:
names_tuple = ("Indiana", "Fang", "Linsey")
names_tuple

In [None]:
names_tuple[0] = "Not cool guy"

In [None]:
my_name = "Qurat"
my_name

In [None]:
my_name[-1] = "e"

In [None]:
x = ([1, 2, 3], 5)
x[1]

In [None]:
x[1] = 7

In [None]:
x[0][2] = 7
x

In [None]:
x[0] = [1, 2, 7]

## In-Class Activity (~10 minutes)

Find out what the following string methods do. Test each one on a few examples:

- `lower()`
- `split()`
- `count()`
- `join()`


### 14. Adding Strings and Lists, Print Template, Empties


In [None]:
# [1, 2, 3] + [5, 6, 7]

In [None]:
"I am a nerd" + "so are you"

In [None]:
name = 'Newborn Baby'
age = 4 / 12
day = 10
month = 5
year = 2023
print(f'Hello, my name is {name}. I am {age:.2f} years old. I was born on {day}/{month:02}/{year}.')

In [None]:
new_list = []
new_list

In [None]:
newer_list = list()
newer_list

In [None]:
new_set = set() # note: {} would create an empty dict, not a set
new_set

In [None]:
new_tuple = ()
new_tuple

In [None]:
new_str = ''
new_str
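
One detail worth checking when creating empty containers is that empty curly braces produce a dictionary, not a set. A quick sketch (the variable names are illustrative):

```python
empty_braces = {}     # curly braces with nothing inside
empty_set = set()     # the only way to write an empty set

print(type(empty_braces))
print(type(empty_set))
```

This is because the `{}` literal was already used for dictionaries; a non-empty literal such as `{1, 2}` does create a set.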

### 15. Conditionals

```python
if condition1:
    do something
elif condition2:
    do something
elif condition3:
    do something
else:
    do something else
```


In [None]:
name = "Santa"

if name.lower() == "qurat":
    print("That's my name too!")
elif name.lower() == "santa":
    print("That's a nice name.")
else:
    print(f"Hello {name}! That's a cool name!")

print("Nice to meet you!")

In [None]:
name = "Super Qurat"

if name.lower() == "qurat":
    print("That's my name too!")
elif name.lower() == "santa":
    print("That's a nice name.")
else:
    print(f"Hello {name}! That's a cool name.")
    if name.lower().startswith("super"):
        print("Do you really have superpowers?")

print("Nice to meet you!")

In [None]:
words = ["the", "list", "of", "words"]

if len(words) > 10:
    x = "long list"
else:
    x = "short list"

x

In [None]:
x = 1

if x:
    print("I'm truthy!")
else:
    print("I'm falsey!")

## In Class Activity (~3 min)

Redo the above code chunk for `x = False` and `x=[]`


### 16. `for` Loops


In [None]:
for n in [2, 7, -1, 5]:
    print(f"The number is {n} and its square is {n**2}")

print("I'm outside the loop!")

In [None]:
word = "Python"
for letter in word:
    print("Gimme a " + letter + "!")

print(f"What's that spell?!! {word}!")

In [None]:
range(10)

In [None]:
list(range(10))

In [None]:
for i in range(1, 101, 10): # start at 1, go till 101 (not included) by an increment of 10
    print(i)

In [None]:
for x in [1, 2, 3]:
    for y in ["a", "b", "c"]:
        print((x, y))

In [None]:
list_1 = ["a", "b", "c"]
for n, i in enumerate(list_1):
    print(f"index {n}, value {i}")

### 17. `while` Loop

**Caution:** a `while` loop runs forever if its condition never becomes false. Use `break` to force the loop to terminate.


In [None]:
n = 10
while n > 0:
    print(n)
    n -= 1

print("I'm done!")

In [None]:
n = 123
i = 0

while n != 1:
    print(n)
    if n % 2 == 0:  # n is even
        n = n // 2
    else:  # n is odd
        n = n * 3 + 1
    i += 1
    if i == 10:
        print("Too many iterations, I'm tired!")
        break
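
As a sketch of the caution at the start of this section, a loop whose condition never becomes false can still be stopped safely with `break`. The cap `max_steps` is a hypothetical safeguard chosen for this example.

```python
max_steps = 5   # illustrative safety cap
steps = 0

while True:     # the condition alone would never end the loop
    steps += 1
    print(f"step {steps}")
    if steps >= max_steps:
        break   # force termination once the cap is reached

print("Loop ended after", steps, "steps")
```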

### 18. Functions

A handy tool for re-using blocks of code with different inputs. The typical format is:

```python
def function(arg1, arg2, ...):
    # do something
    output = ...
    return output
```

For example, the probability density function (pdf) of the Gaussian distribution is:
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}$. Let's try to compute it using different values of $\sigma$ and $\mu$.


In [None]:
import math

(1 / (0.3 * (2 * math.pi)**0.5)) * math.exp(-0.5 * ((2 - 2.5) / 0.3)**2) # sigma = 0.3, x = 2, mu = 2.5

In [None]:
(1 / (0.5 * (2 * math.pi)**0.5)) * math.exp(-0.5 * ((2 - 3) / 0.5)**2) # sigma = 0.5, x = 2, mu = 3

We can do this more efficiently using functions. Let's try the following code:


In [None]:
def pdf_normal(x, mu, sigma):
    prefactor = (1 / (sigma * (2 * math.pi)**0.5))
    exp_value = math.exp(-0.5 * ((x - mu) / sigma)**2)
    pdf = prefactor * exp_value
    return pdf

In [None]:
pdf_normal(2, 2.5, 0.3)

In [None]:
pdf_normal(2, 3, 0.5)
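
One way to sanity-check a hand-written formula like this is to compare it with an independent implementation. The sketch below assumes Python 3.8 or later, where the standard library provides `statistics.NormalDist`; `pdf_normal` is repeated here only so that the block runs on its own.

```python
import math
from statistics import NormalDist

def pdf_normal(x, mu, sigma):
    prefactor = 1 / (sigma * (2 * math.pi) ** 0.5)
    exp_value = math.exp(-0.5 * ((x - mu) / sigma) ** 2)
    return prefactor * exp_value

# Compare our function with the standard library for the two cases used above
for x, mu, sigma in [(2, 2.5, 0.3), (2, 3, 0.5)]:
    ours = pdf_normal(x, mu, sigma)
    reference = NormalDist(mu, sigma).pdf(x)
    print(ours, reference, math.isclose(ours, reference))
```

Comparing with `math.isclose` rather than `==` allows for tiny floating-point differences between the two calculations.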

In [None]:
# pdf

In [None]:
def repeat_twice(s, n=2): # n=2 is default
    return s*n

In [None]:
repeat_twice("I am awesome! ")

In [None]:
repeat_twice("I am awesome! ", 4)

In [None]:
def sum_and_product(x, y):
    return x + y, x * y

In [None]:
sum_and_product(6, 10)

In [None]:
a = sum_and_product(6, 10)
print(a[0])
print(a[1])

In [None]:
s, p = sum_and_product(6, 10)
s

In [None]:
p

## In Class Activity (~10 min)

Take a list of strings, and return a list in which each string is concatenated with the reverse of itself. For example, for the input

```python
["Qurat", "Antonio", "Siyi"]
```

the output should be

```python
["QurattaruQ", "AntoniooinotnA", "SiyiiyiS"]
```


### 19. Pandas Dataframes


In [58]:
import pandas as pd

url = "https://raw.githubusercontent.com/qurat-azim/instructionaldatasets/main/data/imdb.csv"
df = pd.read_csv(url)

In [None]:
df.head()

In [None]:
df.tail(3)

In [None]:
df.info()

In [None]:
df.shape

In [None]:
df.describe()

In [None]:
df.dtypes

### 20. Filtering and Modifying Data


In [None]:
df[df['Released_Year'] > 2021]

In [None]:
df[df['IMDB_Rating'] > 8.0]

## In Class Activity (~ 5 min)

- Filter the rows for movies whose genre is `'Crime, Drama'`
- How many movies have more than a million votes?


In [115]:
df[df['Released_Year'] > 2021]

Unnamed: 0,Series_Title,Released_Year,Certificate,Runtime,Genre,IMDB_Rating,Meta_score,Director,Star1,Star2,Star3,Star4,No_of_Votes,Gross
8,Inception,3010,UA,148 min,"Action, Adventure, Sci-Fi",8.8,74.0,Christopher Nolan,Leonardo DiCaprio,Joseph Gordon-Levitt,Elliot Page,Ken Watanabe,2067042,292576195.0


In [None]:
df[df['Released_Year'] > 2021]['Released_Year'] = 2010 # trying to set 2010 as released year for any movie with greater than 2021 as the year

In [None]:
great2021 = df[df['Released_Year'] > 2021]
great2021.info()

In [None]:
df[df['Released_Year'] > 2010]

**Caution:** Trying to set values on a copy is confusing! Always check whether the assignment actually took effect. Let's take pandas' suggestion and use `.loc`:


In [116]:
df.loc[df['Released_Year'] > 2021, 'Released_Year'] = 2010

In [117]:
df[df['Released_Year'] > 2021]

Unnamed: 0,Series_Title,Released_Year,Certificate,Runtime,Genre,IMDB_Rating,Meta_score,Director,Star1,Star2,Star3,Star4,No_of_Votes,Gross


In [None]:
df[df['Series_Title'] == 'Inception']

The value has been set correctly now!
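
The same pattern can be tried on a tiny dataframe where the result is easy to inspect. The data below are made up for illustration; only the `.loc[row_mask, column] = value` pattern comes from the cells above.

```python
import pandas as pd

toy = pd.DataFrame({"title": ["A", "B", "C"],
                    "year": [1999, 3010, 2015]})

mask = toy["year"] > 2021          # boolean Series: True where the year looks wrong
toy.loc[mask, "year"] = 2010       # select rows and column in one step, then assign

print(toy)
print((toy["year"] > 2021).sum())  # number of rows that still match the filter
```

Because the rows and the column are selected in a single `.loc` call, the assignment acts on `toy` itself rather than on a temporary copy.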


## In Class Activity (~5 min)

Set the `Certificate` column to the character `'Z'` for every movie with an IMDB rating above 9.2. How many such movies did you find?


In [None]:
# df[df['IMDB_Rating'] > 9.2]

### 21. Renaming Columns


In [118]:
df

Unnamed: 0,Series_Title,Released_Year,Certificate,Runtime,Genre,IMDB_Rating,Meta_score,Director,Star1,Star2,Star3,Star4,No_of_Votes,Gross
0,The Shawshank Redemption,1994,A,142 min,Drama,9.3,80.0,Frank Darabont,Tim Robbins,Morgan Freeman,Bob Gunton,William Sadler,2343110,28341469.0
1,The Godfather,1972,A,175 min,"Crime, Drama",9.2,100.0,Francis Ford Coppola,Marlon Brando,Al Pacino,James Caan,Diane Keaton,1620367,134966411.0
2,The Dark Knight,2008,UA,152 min,"Action, Crime, Drama",9.0,84.0,Christopher Nolan,Christian Bale,Heath Ledger,Aaron Eckhart,Michael Caine,2303232,534858444.0
3,The Godfather: Part II,1974,A,202 min,"Crime, Drama",9.0,90.0,Francis Ford Coppola,Al Pacino,Robert De Niro,Robert Duvall,Diane Keaton,1129952,57300000.0
4,12 Angry Men,1957,U,96 min,"Crime, Drama",9.0,96.0,Sidney Lumet,Henry Fonda,Lee J. Cobb,Martin Balsam,John Fiedler,689845,4360000.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
995,Breakfast at Tiffany's,1961,A,115 min,"Comedy, Drama, Romance",7.6,76.0,Blake Edwards,Audrey Hepburn,George Peppard,Patricia Neal,Buddy Ebsen,166544,
996,Giant,1956,G,201 min,"Drama, Western",7.6,84.0,George Stevens,Elizabeth Taylor,Rock Hudson,James Dean,Carroll Baker,34075,
997,From Here to Eternity,1953,Passed,118 min,"Drama, Romance, War",7.6,85.0,Fred Zinnemann,Burt Lancaster,Montgomery Clift,Deborah Kerr,Donna Reed,43374,30500000.0
998,Lifeboat,1944,,97 min,"Drama, War",7.6,78.0,Alfred Hitchcock,Tallulah Bankhead,John Hodiak,Walter Slezak,William Bendix,26471,


In [119]:
df.rename(columns={"Released_Year": "Year",
                   "Star1": "Actor1"})  # not in-place: returns a renamed copy
df

Unnamed: 0,Series_Title,Released_Year,Certificate,Runtime,Genre,IMDB_Rating,Meta_score,Director,Star1,Star2,Star3,Star4,No_of_Votes,Gross
0,The Shawshank Redemption,1994,A,142 min,Drama,9.3,80.0,Frank Darabont,Tim Robbins,Morgan Freeman,Bob Gunton,William Sadler,2343110,28341469.0
1,The Godfather,1972,A,175 min,"Crime, Drama",9.2,100.0,Francis Ford Coppola,Marlon Brando,Al Pacino,James Caan,Diane Keaton,1620367,134966411.0
2,The Dark Knight,2008,UA,152 min,"Action, Crime, Drama",9.0,84.0,Christopher Nolan,Christian Bale,Heath Ledger,Aaron Eckhart,Michael Caine,2303232,534858444.0
3,The Godfather: Part II,1974,A,202 min,"Crime, Drama",9.0,90.0,Francis Ford Coppola,Al Pacino,Robert De Niro,Robert Duvall,Diane Keaton,1129952,57300000.0
4,12 Angry Men,1957,U,96 min,"Crime, Drama",9.0,96.0,Sidney Lumet,Henry Fonda,Lee J. Cobb,Martin Balsam,John Fiedler,689845,4360000.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
995,Breakfast at Tiffany's,1961,A,115 min,"Comedy, Drama, Romance",7.6,76.0,Blake Edwards,Audrey Hepburn,George Peppard,Patricia Neal,Buddy Ebsen,166544,
996,Giant,1956,G,201 min,"Drama, Western",7.6,84.0,George Stevens,Elizabeth Taylor,Rock Hudson,James Dean,Carroll Baker,34075,
997,From Here to Eternity,1953,Passed,118 min,"Drama, Romance, War",7.6,85.0,Fred Zinnemann,Burt Lancaster,Montgomery Clift,Deborah Kerr,Donna Reed,43374,30500000.0
998,Lifeboat,1944,,97 min,"Drama, War",7.6,78.0,Alfred Hitchcock,Tallulah Bankhead,John Hodiak,Walter Slezak,William Bendix,26471,
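
As the unchanged column names in the output above suggest, `rename` returns a new dataframe rather than modifying `df`. A small sketch with a two-row toy dataframe shows the usual fix of assigning the result back (the name `toy` is illustrative):

```python
import pandas as pd

toy = pd.DataFrame({"Released_Year": [1994, 1972],
                    "Star1": ["Tim Robbins", "Marlon Brando"]})

toy.rename(columns={"Released_Year": "Year"})   # result is discarded
print(list(toy.columns))                        # still the old names

toy = toy.rename(columns={"Released_Year": "Year",
                          "Star1": "Actor1"})   # keep the returned dataframe
print(list(toy.columns))
```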


In [120]:
df.columns = [f"Col {n}" for n in range(14)]
df

Unnamed: 0,Col 0,Col 1,Col 2,Col 3,Col 4,Col 5,Col 6,Col 7,Col 8,Col 9,Col 10,Col 11,Col 12,Col 13
0,The Shawshank Redemption,1994,A,142 min,Drama,9.3,80.0,Frank Darabont,Tim Robbins,Morgan Freeman,Bob Gunton,William Sadler,2343110,28341469.0
1,The Godfather,1972,A,175 min,"Crime, Drama",9.2,100.0,Francis Ford Coppola,Marlon Brando,Al Pacino,James Caan,Diane Keaton,1620367,134966411.0
2,The Dark Knight,2008,UA,152 min,"Action, Crime, Drama",9.0,84.0,Christopher Nolan,Christian Bale,Heath Ledger,Aaron Eckhart,Michael Caine,2303232,534858444.0
3,The Godfather: Part II,1974,A,202 min,"Crime, Drama",9.0,90.0,Francis Ford Coppola,Al Pacino,Robert De Niro,Robert Duvall,Diane Keaton,1129952,57300000.0
4,12 Angry Men,1957,U,96 min,"Crime, Drama",9.0,96.0,Sidney Lumet,Henry Fonda,Lee J. Cobb,Martin Balsam,John Fiedler,689845,4360000.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
995,Breakfast at Tiffany's,1961,A,115 min,"Comedy, Drama, Romance",7.6,76.0,Blake Edwards,Audrey Hepburn,George Peppard,Patricia Neal,Buddy Ebsen,166544,
996,Giant,1956,G,201 min,"Drama, Western",7.6,84.0,George Stevens,Elizabeth Taylor,Rock Hudson,James Dean,Carroll Baker,34075,
997,From Here to Eternity,1953,Passed,118 min,"Drama, Romance, War",7.6,85.0,Fred Zinnemann,Burt Lancaster,Montgomery Clift,Deborah Kerr,Donna Reed,43374,30500000.0
998,Lifeboat,1944,,97 min,"Drama, War",7.6,78.0,Alfred Hitchcock,Tallulah Bankhead,John Hodiak,Walter Slezak,William Bendix,26471,


### 22. Changing Index

We can change the index labels of a dataframe in four main ways:

- Modify `df.index.name` to change the name of the index

- `set_index()` to make one of the columns of the dataframe the index

- `reset_index()` to move the current index into a column and reset the index to the default RangeIndex, which gives integer labels starting from 0

- Directly modify the `.index` attribute
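
The approaches listed above can be sketched on a small, made-up dataframe (the column names and values here are illustrative only):

```python
import pandas as pd

toy = pd.DataFrame({"name": ["Ali", "Miriam", "Liam"],
                    "trips": [1, 3, 4]})

toy = toy.set_index("name")        # make the 'name' column the index
toy.index.name = "traveller"       # rename the index
print(toy)

toy = toy.reset_index()            # index becomes a column again; RangeIndex is restored
print(toy)

toy.index = ["a", "b", "c"]        # overwrite the index labels directly
print(toy)
```

Note that `set_index` and `reset_index` return new dataframes, which is why their results are assigned back to `toy`.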


In [57]:
df = df.set_index("Col 8")

In [122]:
df

Col 8,Col 0,Col 1,Col 2,Col 3,Col 4,Col 5,Col 6,Col 7,Col 9,Col 10,Col 11,Col 12,Col 13
Tim Robbins,The Shawshank Redemption,1994,A,142 min,Drama,9.3,80.0,Frank Darabont,Morgan Freeman,Bob Gunton,William Sadler,2343110,28341469.0
Marlon Brando,The Godfather,1972,A,175 min,"Crime, Drama",9.2,100.0,Francis Ford Coppola,Al Pacino,James Caan,Diane Keaton,1620367,134966411.0
Christian Bale,The Dark Knight,2008,UA,152 min,"Action, Crime, Drama",9.0,84.0,Christopher Nolan,Heath Ledger,Aaron Eckhart,Michael Caine,2303232,534858444.0
Al Pacino,The Godfather: Part II,1974,A,202 min,"Crime, Drama",9.0,90.0,Francis Ford Coppola,Robert De Niro,Robert Duvall,Diane Keaton,1129952,57300000.0
Henry Fonda,12 Angry Men,1957,U,96 min,"Crime, Drama",9.0,96.0,Sidney Lumet,Lee J. Cobb,Martin Balsam,John Fiedler,689845,4360000.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...
Audrey Hepburn,Breakfast at Tiffany's,1961,A,115 min,"Comedy, Drama, Romance",7.6,76.0,Blake Edwards,George Peppard,Patricia Neal,Buddy Ebsen,166544,
Elizabeth Taylor,Giant,1956,G,201 min,"Drama, Western",7.6,84.0,George Stevens,Rock Hudson,James Dean,Carroll Baker,34075,
Burt Lancaster,From Here to Eternity,1953,Passed,118 min,"Drama, Romance, War",7.6,85.0,Fred Zinnemann,Montgomery Clift,Deborah Kerr,Donna Reed,43374,30500000.0
Tallulah Bankhead,Lifeboat,1944,,97 min,"Drama, War",7.6,78.0,Alfred Hitchcock,John Hodiak,Walter Slezak,William Bendix,26471,


In [123]:
df.index.name = "New Index"
df

New Index,Col 0,Col 1,Col 2,Col 3,Col 4,Col 5,Col 6,Col 7,Col 9,Col 10,Col 11,Col 12,Col 13
Tim Robbins,The Shawshank Redemption,1994,A,142 min,Drama,9.3,80.0,Frank Darabont,Morgan Freeman,Bob Gunton,William Sadler,2343110,28341469.0
Marlon Brando,The Godfather,1972,A,175 min,"Crime, Drama",9.2,100.0,Francis Ford Coppola,Al Pacino,James Caan,Diane Keaton,1620367,134966411.0
Christian Bale,The Dark Knight,2008,UA,152 min,"Action, Crime, Drama",9.0,84.0,Christopher Nolan,Heath Ledger,Aaron Eckhart,Michael Caine,2303232,534858444.0
Al Pacino,The Godfather: Part II,1974,A,202 min,"Crime, Drama",9.0,90.0,Francis Ford Coppola,Robert De Niro,Robert Duvall,Diane Keaton,1129952,57300000.0
Henry Fonda,12 Angry Men,1957,U,96 min,"Crime, Drama",9.0,96.0,Sidney Lumet,Lee J. Cobb,Martin Balsam,John Fiedler,689845,4360000.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...
Audrey Hepburn,Breakfast at Tiffany's,1961,A,115 min,"Comedy, Drama, Romance",7.6,76.0,Blake Edwards,George Peppard,Patricia Neal,Buddy Ebsen,166544,
Elizabeth Taylor,Giant,1956,G,201 min,"Drama, Western",7.6,84.0,George Stevens,Rock Hudson,James Dean,Carroll Baker,34075,
Burt Lancaster,From Here to Eternity,1953,Passed,118 min,"Drama, Romance, War",7.6,85.0,Fred Zinnemann,Montgomery Clift,Deborah Kerr,Donna Reed,43374,30500000.0
Tallulah Bankhead,Lifeboat,1944,,97 min,"Drama, War",7.6,78.0,Alfred Hitchcock,John Hodiak,Walter Slezak,William Bendix,26471,


In [3]:
df = df.reset_index()
df


### 23. Adding and Removing Columns and Rows


In [59]:

df = pd.read_csv('https://raw.githubusercontent.com/qurat-azim/instructionaldatasets/main/data/imdb.csv')
df

Unnamed: 0,Series_Title,Released_Year,Certificate,Runtime,Genre,IMDB_Rating,Meta_score,Director,Star1,Star2,Star3,Star4,No_of_Votes,Gross
0,The Shawshank Redemption,1994,A,142 min,Drama,9.3,80.0,Frank Darabont,Tim Robbins,Morgan Freeman,Bob Gunton,William Sadler,2343110,28341469.0
1,The Godfather,1972,A,175 min,"Crime, Drama",9.2,100.0,Francis Ford Coppola,Marlon Brando,Al Pacino,James Caan,Diane Keaton,1620367,134966411.0
2,The Dark Knight,2008,UA,152 min,"Action, Crime, Drama",9.0,84.0,Christopher Nolan,Christian Bale,Heath Ledger,Aaron Eckhart,Michael Caine,2303232,534858444.0
3,The Godfather: Part II,1974,A,202 min,"Crime, Drama",9.0,90.0,Francis Ford Coppola,Al Pacino,Robert De Niro,Robert Duvall,Diane Keaton,1129952,57300000.0
4,12 Angry Men,1957,U,96 min,"Crime, Drama",9.0,96.0,Sidney Lumet,Henry Fonda,Lee J. Cobb,Martin Balsam,John Fiedler,689845,4360000.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
995,Breakfast at Tiffany's,1961,A,115 min,"Comedy, Drama, Romance",7.6,76.0,Blake Edwards,Audrey Hepburn,George Peppard,Patricia Neal,Buddy Ebsen,166544,
996,Giant,1956,G,201 min,"Drama, Western",7.6,84.0,George Stevens,Elizabeth Taylor,Rock Hudson,James Dean,Carroll Baker,34075,
997,From Here to Eternity,1953,Passed,118 min,"Drama, Romance, War",7.6,85.0,Fred Zinnemann,Burt Lancaster,Montgomery Clift,Deborah Kerr,Donna Reed,43374,30500000.0
998,Lifeboat,1944,,97 min,"Drama, War",7.6,78.0,Alfred Hitchcock,Tallulah Bankhead,John Hodiak,Walter Slezak,William Bendix,26471,


In [134]:
df['RottenTomato_score'] = 0
df

Unnamed: 0,Series_Title,Released_Year,Certificate,Runtime,Genre,IMDB_Rating,Meta_score,Director,Star1,Star2,Star3,Star4,No_of_Votes,Gross,RottenTomato_score
0,The Shawshank Redemption,1994,A,142 min,Drama,9.3,80.0,Frank Darabont,Tim Robbins,Morgan Freeman,Bob Gunton,William Sadler,2343110,28341469.0,0
1,The Godfather,1972,A,175 min,"Crime, Drama",9.2,100.0,Francis Ford Coppola,Marlon Brando,Al Pacino,James Caan,Diane Keaton,1620367,134966411.0,0
2,The Dark Knight,2008,UA,152 min,"Action, Crime, Drama",9.0,84.0,Christopher Nolan,Christian Bale,Heath Ledger,Aaron Eckhart,Michael Caine,2303232,534858444.0,0
3,The Godfather: Part II,1974,A,202 min,"Crime, Drama",9.0,90.0,Francis Ford Coppola,Al Pacino,Robert De Niro,Robert Duvall,Diane Keaton,1129952,57300000.0,0
4,12 Angry Men,1957,U,96 min,"Crime, Drama",9.0,96.0,Sidney Lumet,Henry Fonda,Lee J. Cobb,Martin Balsam,John Fiedler,689845,4360000.0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
995,Breakfast at Tiffany's,1961,A,115 min,"Comedy, Drama, Romance",7.6,76.0,Blake Edwards,Audrey Hepburn,George Peppard,Patricia Neal,Buddy Ebsen,166544,,0
996,Giant,1956,G,201 min,"Drama, Western",7.6,84.0,George Stevens,Elizabeth Taylor,Rock Hudson,James Dean,Carroll Baker,34075,,0
997,From Here to Eternity,1953,Passed,118 min,"Drama, Romance, War",7.6,85.0,Fred Zinnemann,Burt Lancaster,Montgomery Clift,Deborah Kerr,Donna Reed,43374,30500000.0,0
998,Lifeboat,1944,,97 min,"Drama, War",7.6,78.0,Alfred Hitchcock,Tallulah Bankhead,John Hodiak,Walter Slezak,William Bendix,26471,,0


In [None]:
df = df.drop(columns=['Certificate', 'Meta_score', 'Star3', 'Star4'])
df

In [None]:
df.drop(df.index[5:], axis=0) # not in-place

In [None]:
df

In [None]:
df2 = df.iloc[:3, :4]
df2

### 24. Reshaping Dataframes

The two main functions to consider are `melt`, which reshapes a wide dataframe into long format, and `pivot`, which does the reverse.


In [60]:
df = pd.DataFrame({"Name": ["Ali", "Miriam", "Liam", "Amanda", "Qin"],
                   "2021": [1, 3, 4, 5, 3],
                   "2022": [2, 4, 3, 2, 1],
                   "2023": [5, 2, 4, 4, 3]})
df

Unnamed: 0,Name,2021,2022,2023
0,Ali,1,2,5
1,Miriam,3,4,2
2,Liam,4,3,4
3,Amanda,5,2,4
4,Qin,3,1,3


In [None]:
df.melt()

In [None]:
df.melt(id_vars="Name")

In [None]:
df.melt(id_vars="Name", value_vars=["2022"])

In [None]:
df.melt(id_vars="Name", value_vars=["2022", "2021"], var_name="Year")

In [62]:
df_melt = df.melt(id_vars="Name", var_name="Year", value_name="Num_trips")
df_melt

Unnamed: 0,Name,Year,Num_trips
0,Ali,2021,1
1,Miriam,2021,3
2,Liam,2021,4
3,Amanda,2021,5
4,Qin,2021,3
5,Ali,2022,2
6,Miriam,2022,4
7,Liam,2022,3
8,Amanda,2022,2
9,Qin,2022,1


In [63]:
df_pivot = df_melt.pivot(index="Name",
                         columns="Year",
                         values="Num_trips"
                        )
df_pivot

Year,2021,2022,2023
Name,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
Ali,1,2,5
Amanda,5,2,4
Liam,4,3,4
Miriam,3,4,2
Qin,3,1,3


In [None]:
df_pivot = df_pivot.reset_index()
df_pivot

### 25. Stacking Dataframes


In [None]:
df1 = pd.DataFrame({'A': [1, 3, 5],
                    'B': [2, 4, 6]})
df1

In [None]:
df2 = pd.DataFrame({'A': [7, 9, 11],
                    'B': [8, 10, 12]})
df2

In [None]:
pd.concat((df1, df2), axis=0)

In [None]:
pd.concat((df1, df2), axis=0, ignore_index=True)

In [None]:
pd.concat((df1, df2), axis=1, ignore_index=True)

### 26. Grouping in DataFrames


In [64]:
df = pd.read_csv('https://raw.githubusercontent.com/qurat-azim/instructionaldatasets/main/data/imdb.csv')
df

Unnamed: 0,Series_Title,Released_Year,Certificate,Runtime,Genre,IMDB_Rating,Meta_score,Director,Star1,Star2,Star3,Star4,No_of_Votes,Gross
0,The Shawshank Redemption,1994,A,142 min,Drama,9.3,80.0,Frank Darabont,Tim Robbins,Morgan Freeman,Bob Gunton,William Sadler,2343110,28341469.0
1,The Godfather,1972,A,175 min,"Crime, Drama",9.2,100.0,Francis Ford Coppola,Marlon Brando,Al Pacino,James Caan,Diane Keaton,1620367,134966411.0
2,The Dark Knight,2008,UA,152 min,"Action, Crime, Drama",9.0,84.0,Christopher Nolan,Christian Bale,Heath Ledger,Aaron Eckhart,Michael Caine,2303232,534858444.0
3,The Godfather: Part II,1974,A,202 min,"Crime, Drama",9.0,90.0,Francis Ford Coppola,Al Pacino,Robert De Niro,Robert Duvall,Diane Keaton,1129952,57300000.0
4,12 Angry Men,1957,U,96 min,"Crime, Drama",9.0,96.0,Sidney Lumet,Henry Fonda,Lee J. Cobb,Martin Balsam,John Fiedler,689845,4360000.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
995,Breakfast at Tiffany's,1961,A,115 min,"Comedy, Drama, Romance",7.6,76.0,Blake Edwards,Audrey Hepburn,George Peppard,Patricia Neal,Buddy Ebsen,166544,
996,Giant,1956,G,201 min,"Drama, Western",7.6,84.0,George Stevens,Elizabeth Taylor,Rock Hudson,James Dean,Carroll Baker,34075,
997,From Here to Eternity,1953,Passed,118 min,"Drama, Romance, War",7.6,85.0,Fred Zinnemann,Burt Lancaster,Montgomery Clift,Deborah Kerr,Donna Reed,43374,30500000.0
998,Lifeboat,1944,,97 min,"Drama, War",7.6,78.0,Alfred Hitchcock,Tallulah Bankhead,John Hodiak,Walter Slezak,William Bendix,26471,


In [65]:
df.groupby(by='Genre')

<pandas.core.groupby.generic.DataFrameGroupBy object at 0x00000226F9735100>

In [66]:
df.groupby(by='Genre').groups

{'Action, Adventure': [63, 72, 155, 168, 840], 'Action, Adventure, Biography': [540], 'Action, Adventure, Comedy': [177, 320, 325, 339, 348, 473, 532, 722, 730, 887], 'Action, Adventure, Crime': [909], 'Action, Adventure, Drama': [5, 10, 13, 31, 39, 59, 343, 496, 625, 642, 709, 821, 898, 944], 'Action, Adventure, Family': [927], 'Action, Adventure, Fantasy': [16, 29, 109, 376, 623, 645], 'Action, Adventure, History': [507], 'Action, Adventure, Horror': [535], 'Action, Adventure, Mystery': [914], 'Action, Adventure, Romance': [564], 'Action, Adventure, Sci-Fi': [8, 60, 106, 223, 262, 357, 477, 479, 482, 493, 502, 582, 583, 634, 677, 737, 746, 749, 807, 839, 982], 'Action, Adventure, Thriller': [368, 725, 751, 861, 963], 'Action, Adventure, War': [854, 856], 'Action, Adventure, Western': [543, 865], 'Action, Biography, Crime': [142, 702, 985], 'Action, Biography, Drama': [57, 216, 217, 351, 659, 889, 924], 'Action, Comedy, Crime': [140, 160, 161, 294, 569, 908], 'Action, Comedy, Fantasy'

In [67]:
df.groupby(by='Genre').get_group('Action, Adventure')

Unnamed: 0,Series_Title,Released_Year,Certificate,Runtime,Genre,IMDB_Rating,Meta_score,Director,Star1,Star2,Star3,Star4,No_of_Votes,Gross
63,The Dark Knight Rises,2012,UA,164 min,"Action, Adventure",8.4,78.0,Christopher Nolan,Christian Bale,Tom Hardy,Anne Hathaway,Gary Oldman,1516346,448139099.0
72,Raiders of the Lost Ark,1981,A,115 min,"Action, Adventure",8.4,85.0,Steven Spielberg,Harrison Ford,Karen Allen,Paul Freeman,John Rhys-Davies,884112,248159971.0
155,Batman Begins,2005,UA,140 min,"Action, Adventure",8.2,70.0,Christopher Nolan,Christian Bale,Michael Caine,Ken Watanabe,Liam Neeson,1308302,206852432.0
168,Indiana Jones and the Last Crusade,1989,U,127 min,"Action, Adventure",8.2,65.0,Steven Spielberg,Harrison Ford,Sean Connery,Alison Doody,Denholm Elliott,692366,197171806.0
840,First Blood,1982,A,93 min,"Action, Adventure",7.7,61.0,Ted Kotcheff,Sylvester Stallone,Brian Dennehy,Richard Crenna,Bill McKinney,226541,47212904.0


In [None]:
# df.groupby(by='Genre').mean()
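
Calling `.mean()` on the whole grouped dataframe can raise an error in recent pandas versions when the dataframe contains text columns, which is one reason to select the numeric columns first, as the next cell does. An alternative, sketched here on made-up data, is to pass `numeric_only=True`:

```python
import pandas as pd

toy = pd.DataFrame({"Genre": ["Drama", "Drama", "Crime"],
                    "Title": ["A", "B", "C"],        # text column, cannot be averaged
                    "Rating": [9.0, 8.0, 7.5]})

# numeric_only=True skips non-numeric columns such as 'Title'
means = toy.groupby(by="Genre").mean(numeric_only=True)
print(means)
```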

In [68]:
df[['Genre', 'IMDB_Rating']].groupby(by='Genre').mean()

Genre,IMDB_Rating
"Action, Adventure",8.180000
"Action, Adventure, Biography",7.900000
"Action, Adventure, Comedy",7.910000
"Action, Adventure, Crime",7.600000
"Action, Adventure, Drama",8.150000
...,...
"Mystery, Romance, Thriller",8.300000
"Mystery, Sci-Fi, Thriller",7.800000
"Mystery, Thriller",7.977778
Thriller,7.800000


## In Class Activity (~3 min)

Show the average number of votes for movies released in each year. Display only the released year and the mean votes.


In [69]:
(
    df.loc[:, ['Genre', 'IMDB_Rating', 'Meta_score', 'No_of_Votes']]
    .groupby(by='Genre')
    .aggregate(['mean', 'sum', 'count'])
)

,IMDB_Rating,IMDB_Rating,IMDB_Rating,Meta_score,Meta_score,Meta_score,No_of_Votes,No_of_Votes,No_of_Votes
Genre,mean,sum,count,mean,sum,count,mean,sum,count
"Action, Adventure",8.180000,40.9,5,71.800000,359.0,5,925533.400000,4627667,5
"Action, Adventure, Biography",7.900000,7.9,1,,0.0,0,52397.000000,52397,1
"Action, Adventure, Comedy",7.910000,79.1,10,66.857143,468.0,7,456076.600000,4560766,10
"Action, Adventure, Crime",7.600000,7.6,1,,0.0,0,63882.000000,63882,1
"Action, Adventure, Drama",8.150000,114.1,14,80.461538,1046.0,13,663989.928571,9295859,14
...,...,...,...,...,...,...,...,...,...
"Mystery, Romance, Thriller",8.300000,8.3,1,100.000000,100.0,1,364368.000000,364368,1
"Mystery, Sci-Fi, Thriller",7.800000,15.6,2,70.000000,140.0,2,383185.000000,766370,2
"Mystery, Thriller",7.977778,71.8,9,78.600000,393.0,5,341362.888889,3072266,9
Thriller,7.800000,7.8,1,81.000000,81.0,1,27733.000000,27733,1


In [71]:
df

Unnamed: 0,Series_Title,Released_Year,Certificate,Runtime,Genre,IMDB_Rating,Meta_score,Director,Star1,Star2,Star3,Star4,No_of_Votes,Gross
0,The Shawshank Redemption,1994,A,142 min,Drama,9.3,80.0,Frank Darabont,Tim Robbins,Morgan Freeman,Bob Gunton,William Sadler,2343110,28341469.0
1,The Godfather,1972,A,175 min,"Crime, Drama",9.2,100.0,Francis Ford Coppola,Marlon Brando,Al Pacino,James Caan,Diane Keaton,1620367,134966411.0
2,The Dark Knight,2008,UA,152 min,"Action, Crime, Drama",9.0,84.0,Christopher Nolan,Christian Bale,Heath Ledger,Aaron Eckhart,Michael Caine,2303232,534858444.0
3,The Godfather: Part II,1974,A,202 min,"Crime, Drama",9.0,90.0,Francis Ford Coppola,Al Pacino,Robert De Niro,Robert Duvall,Diane Keaton,1129952,57300000.0
4,12 Angry Men,1957,U,96 min,"Crime, Drama",9.0,96.0,Sidney Lumet,Henry Fonda,Lee J. Cobb,Martin Balsam,John Fiedler,689845,4360000.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
995,Breakfast at Tiffany's,1961,A,115 min,"Comedy, Drama, Romance",7.6,76.0,Blake Edwards,Audrey Hepburn,George Peppard,Patricia Neal,Buddy Ebsen,166544,
996,Giant,1956,G,201 min,"Drama, Western",7.6,84.0,George Stevens,Elizabeth Taylor,Rock Hudson,James Dean,Carroll Baker,34075,
997,From Here to Eternity,1953,Passed,118 min,"Drama, Romance, War",7.6,85.0,Fred Zinnemann,Burt Lancaster,Montgomery Clift,Deborah Kerr,Donna Reed,43374,30500000.0
998,Lifeboat,1944,,97 min,"Drama, War",7.6,78.0,Alfred Hitchcock,Tallulah Bankhead,John Hodiak,Walter Slezak,William Bendix,26471,


In [75]:
(
    df.loc[:, ['Released_Year', 'No_of_Votes']]
    .groupby(by='Released_Year')
    .mean()
)

Unnamed: 0_level_0,No_of_Votes
Released_Year,Unnamed: 1_level_1
1920,5.742800e+04
1921,1.133140e+05
1922,8.879400e+04
1924,4.198500e+04
1925,7.705350e+04
...,...
2017,2.586361e+05
2018,2.195192e+05
2019,2.601555e+05
2020,8.412700e+04
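The result above keeps `Released_Year` as the index. If the year should appear as an ordinary column next to the mean, one option is `as_index=False`. The sketch below uses a toy DataFrame with made-up values (only the column names mirror the movie dataset), so the numbers are purely illustrative.

```python
import pandas as pd

# Toy data with hypothetical values; only the column names match the movie dataset
toy = pd.DataFrame({
    'Released_Year': ['1994', '1994', '2008'],
    'No_of_Votes': [100, 300, 500],
})

# as_index=False keeps the grouping key as a regular column in the result
mean_votes = (
    toy.groupby(by='Released_Year', as_index=False)['No_of_Votes']
    .mean()
)
print(mean_votes)
```

Calling `.reset_index()` on the indexed result would be an equivalent way to get the same two-column layout.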


### Introduction to NumPy

For detailed usage instructions, refer to the [NumPy beginner's guide](https://numpy.org/doc/stable/user/absolute_beginners.html).


NumPy is a Python library for working with arrays. It also provides functions for linear algebra, Fourier transforms, and matrix operations. NumPy stands for **Numerical Python**.

Let's start by importing the library.


In [76]:
import numpy as np

#### Creating NumPy Arrays

The core feature of NumPy is the `ndarray` object, which is used to store multi-dimensional arrays. Let's look at how to create arrays using NumPy.


In [77]:
arr1 = np.array([1, 2, 3, 4, 5]) # 1D array
print("1D array:", arr1)

1D array: [1 2 3 4 5]


In [78]:
arr2 = np.array([[1, 2, 3], [4, 5, 6]]) # 2D NumPy array
print("2D array:\n", arr2)

2D array:
 [[1 2 3]
 [4 5 6]]


You can create both 1D (one-dimensional) and 2D (two-dimensional) arrays using the `np.array()` function.
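Besides `np.array()`, NumPy has helper functions that build arrays without typing every value. The following is a small sketch of three common ones; the variable names are illustrative only.

```python
import numpy as np

seq = np.arange(1, 6)         # integers 1..5 (the stop value 6 is excluded)
zeros = np.zeros((2, 3))      # 2x3 array filled with 0.0
grid = np.linspace(0, 1, 5)   # 5 evenly spaced values from 0 to 1, both ends included

print(seq)
print(zeros)
print(grid)
```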

#### Array Attributes

Let's take a look at the attributes of a NumPy array, such as its shape and data type.


In [79]:
# Shape of the array
print("Shape of arr1:", arr1.shape)
print("Shape of arr2:", arr2.shape)

# Data type of the array
print("Data type of arr1:", arr1.dtype)

Shape of arr1: (5,)
Shape of arr2: (2, 3)
Data type of arr1: int64


The `shape` attribute returns a tuple with the size of the array along each dimension (rows first, then columns for a 2D array), and the `dtype` attribute tells us the data type of the elements in the array.
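Two related attributes, `ndim` and `size`, are often useful alongside `shape`, and `astype()` converts an array to another data type. A minimal sketch, with an illustrative array `m`:

```python
import numpy as np

m = np.array([[1, 2, 3], [4, 5, 6]])

print("Number of dimensions:", m.ndim)   # 2
print("Total elements:", m.size)         # 6

# astype returns a new array with the requested dtype; m itself is unchanged
as_float = m.astype(float)
print("Converted dtype:", as_float.dtype)
```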

#### Array Operations

NumPy allows you to perform element-wise operations on arrays. These include arithmetic operations like addition, subtraction, and more.


In [83]:
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])  # same length as a, so the shapes match

print("Addition:", a + b) # Element-wise addition
print("Multiplication:", a * b) # Element-wise multiplication

Addition: [5 7 9]
Multiplication: [ 4 10 18]
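Element-wise arithmetic requires operands whose shapes match or can be broadcast together. The sketch below shows what happens when two 1D arrays have different lengths (3 and 4): NumPy raises a `ValueError` instead of guessing how to align them.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6, 7])

try:
    a + b
    mismatch_raised = False
except ValueError as err:
    # Shapes (3,) and (4,) cannot be broadcast together
    mismatch_raised = True
    print("Shapes", a.shape, "and", b.shape, "are incompatible:", err)
```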


#### Array Broadcasting

Broadcasting in NumPy allows you to perform operations on arrays of different shapes. NumPy automatically "stretches" the smaller array to match the shape of the larger one.


In [89]:
arr = np.array([[1, 2, 3], [4, 5, 6]])

print("Array + 1:", arr + 1) # Broadcasting: add 1 to each element of the array

Array + 1: [[2 3 4]
 [5 6 7]]


In [99]:
arr1 = np.array([[1, 2, 3]])
arr2 = np.array([1, 1, 1])
print(f'{arr1} and {arr2}')
print(arr1 @ arr2) # @ performs matrix multiplication (as in linear algebra), not element-wise multiplication


[[1 2 3]] and [1 1 1]
[6]


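In the `arr + 1` example, `1` was added to every element of the array without writing a loop. This is **broadcasting**: NumPy treats the scalar as if it had the same shape as the array. The `@` operator in the second cell is different: it performs matrix multiplication rather than an element-wise operation.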

You can read more on broadcasting [here](https://numpy.org/doc/stable/user/basics.broadcasting.html#basics-broadcasting).
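Broadcasting also applies between two arrays, not only between an array and a scalar. As a hedged illustration (the names `grid`, `row`, and `col` are made up for this sketch), a 1D array of length 3 is stretched across the rows of a 2x3 array, and a 2x1 column is stretched across its columns.

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)
row = np.array([10, 20, 30])              # shape (3,): added to every row
col = np.array([[100], [200]])            # shape (2, 1): added to every column

plus_row = grid + row
plus_col = grid + col

print(plus_row)
print(plus_col)
```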


### In-Class Activity (~5 min)

1. Create two NumPy arrays:
   - The first array should be a 1D array containing values from 1 to 5.
   - The second array should be a 2D array containing two rows: `[10, 20, 30, 40, 50]` and `[5, 15, 25, 35, 45]`.
2. Using broadcasting, subtract the 1D array from each row of the 2D array, then compute the square root of the resulting values.


In [103]:
# Your code
arr1 = np.array([1, 2, 3, 4, 5])
arr2 = np.array([[10, 20, 30, 40, 50], [5, 15, 25, 35, 45]])
arr3 = arr2 - arr1  # arr1 is broadcast across both rows of arr2
print(arr3)
print(arr3**0.5)  # equivalent to np.sqrt(arr3)


[[ 9 18 27 36 45]
 [ 4 13 22 31 40]]
[[3.         4.24264069 5.19615242 6.         6.70820393]
 [2.         3.60555128 4.69041576 5.56776436 6.32455532]]


#### Universal Functions (ufuncs)

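NumPy provides many built-in mathematical functions, called **ufuncs** (universal functions). They operate element-wise on whole arrays and are typically much faster than explicit Python loops because the looping happens in compiled code.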


In [None]:
arr = np.array([1, 2, 3, 4, 5])

print("Square root:", np.sqrt(arr)) # Square root of each element
print("Exponential:", np.exp(arr)) # Exponentiation

These are some common universal functions provided by NumPy to perform fast computations on arrays.
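To make the comparison with loops concrete, the sketch below computes the same square roots twice: once with a Python list comprehension and once with the `np.sqrt` ufunc. It only checks that the results agree; it does not measure speed.

```python
import math
import numpy as np

values = np.array([1, 4, 9, 16])

# Explicit Python loop: one math.sqrt call per element
loop_result = [math.sqrt(v) for v in values]

# ufunc: a single call that processes the whole array
ufunc_result = np.sqrt(values)

print("Same result:", np.allclose(loop_result, ufunc_result))
```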

#### Array Slicing and Indexing

You can access and modify specific elements or sections of a NumPy array using slicing and indexing.


In [106]:
arr2 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(arr2)

print("Element at (0,1):", arr2[0, 1]) # Access element at row 0, column 1
print("Slice:\n", arr2[0:2, 1:3]) # Slice the array to get a subarray

[[1 2 3]
 [4 5 6]
 [7 8 9]]
Element at (0,1): 2
Slice:
 [[2 3]
 [5 6]]
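The same indexing syntax can be used on the left-hand side of an assignment to modify an array. A short sketch with an illustrative array `grid`; note that a basic slice is a view, so writing to it changes the original array as well.

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

grid[0, 1] = 20     # overwrite a single element
grid[1:, 0] = 0     # overwrite the first column of rows 1 and 2

block = grid[0:2, 1:3]   # a view into grid, not a copy
block[0, 0] = -1         # also changes grid[0, 1]

print(grid)
```

Use `grid[0:2, 1:3].copy()` when an independent copy is needed.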


#### Reshaping Arrays

Sometimes, you may want to change the shape of an array without changing its data. NumPy provides the `reshape()` method for this purpose. The new shape must hold exactly the same number of elements as the original array.


In [104]:
arr = np.array([1, 2, 3, 4, 5, 6])
reshaped_arr = arr.reshape((2, 3)) # Reshape a 1D array into a 2-by-3 array
print("Reshaped array:\n", reshaped_arr)

Reshaped array:
 [[1 2 3]
 [4 5 6]]
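Two details of `reshape()` are worth a quick sketch: passing `-1` lets NumPy infer one dimension from the others, and a shape with the wrong number of elements raises a `ValueError`. The variable names here are illustrative.

```python
import numpy as np

arr = np.arange(1, 7)          # [1 2 3 4 5 6]

auto = arr.reshape(3, -1)      # -1 is inferred as 2, since 6 / 3 = 2
flat = auto.reshape(-1)        # back to a 1D array of 6 elements

try:
    arr.reshape(4, 2)          # 8 slots for 6 elements: not allowed
    bad_shape_raised = False
except ValueError:
    bad_shape_raised = True

print(auto.shape, flat.shape, bad_shape_raised)
```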


#### Array Aggregation Functions

NumPy provides various aggregation functions to compute statistics like the sum, mean, maximum, etc.


In [None]:
arr = np.array([1, 2, 3, 4, 5])

print("Sum:", np.sum(arr))
print("Mean:", np.mean(arr))
print("Max:", np.max(arr))
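For 2D arrays, the aggregation functions accept an `axis` argument. A minimal sketch on an illustrative 2x3 array: `axis=0` collapses the rows and gives one value per column, while `axis=1` collapses the columns and gives one value per row.

```python
import numpy as np

m = np.array([[1, 2, 3], [4, 5, 6]])

col_sums = np.sum(m, axis=0)     # one sum per column
row_means = np.mean(m, axis=1)   # one mean per row

print("Column sums:", col_sums)
print("Row means:", row_means)
```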


### In-Class Activity (~5 min)

1. Create a 1D NumPy array with 12 elements ranging from 1 to 12.
2. Reshape the array into a 3x4 matrix (2D array).
3. Slice the matrix to:
   - Extract the first two rows.
   - Extract the last two columns of the matrix.
4. Multiply these two slices element-wise.


In [117]:
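# Your code here
arr1 = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
arr2 = arr1.reshape(3, 4)
print(arr2)
extracted_arr1 = arr2[0:2]      # first two rows, shape (2, 4)
extracted_arr2 = arr2[:, 2:4]   # last two columns, shape (3, 2)
print(extracted_arr2)
# Step 4: shapes (2, 4) and (3, 2) cannot be broadcast together,
# so these two slices cannot be multiplied element-wise as they are.

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]
[[ 3  4]
 [ 7  8]
 [11 12]]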


### In-Class Activity (~5 min)

1. Compute the following statistics:
   - The sum of all elements in the matrix.
   - The mean of each row.
   - The maximum value from the entire matrix.
   - The minimum value from each column.
2. After calculating these values, normalize the matrix such that the minimum value becomes 0 and the maximum value becomes 1.


In [118]:
# Generate a 5x5 matrix of random integers between 10 and 100
matrix = np.random.randint(10, 101, size=(5, 5))
matrix

array([[59, 69, 74, 13, 94],
       [11, 47, 85, 18, 83],
       [80, 46, 75, 21, 51],
       [11, 90, 50, 12, 87],
       [80, 72, 84, 38, 16]], dtype=int32)

In [137]:
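# Your code here
# Descriptive names avoid shadowing the built-ins sum, max, and min
total = np.sum(matrix)
print(total)
row_means = np.mean(matrix, axis=1)
print(row_means)
max_value = np.max(matrix)
print(max_value)
col_mins = np.min(matrix, axis=0)
print(col_mins)
# axis=1 aggregates across the columns (one result per row);
# axis=0 aggregates down the rows (one result per column).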

1366
[61.8 48.8 54.6 50.  58. ]
94
[11 46 50 12 16]
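The second part of the activity, normalization, can be sketched with min-max scaling: subtract the overall minimum and divide by the range. The matrix below is copied from the output shown earlier in this activity; since the original was generated randomly, a fresh run would give different values.

```python
import numpy as np

# Values copied from the randomly generated matrix shown above
matrix = np.array([[59, 69, 74, 13, 94],
                   [11, 47, 85, 18, 83],
                   [80, 46, 75, 21, 51],
                   [11, 90, 50, 12, 87],
                   [80, 72, 84, 38, 16]])

lowest = matrix.min()
highest = matrix.max()

# Min-max scaling: the smallest value maps to 0 and the largest to 1
normalized = (matrix - lowest) / (highest - lowest)
print(normalized)
```

This assumes the matrix is not constant; if every value were equal, the range would be zero and the division would fail.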
