# Part 1 (40 points): Intro to Markdown

Use Markdown in the cell below to write your own brief Wikipedia-style description of a topic related to International Government and Politics.

Your mini-wiki page must include:
- three headers: a title, subtitle and subsubtitle (the #, ##, ### syntax)
- an embedded image from a web address (use an [image hosting site](https://makeawebsitehub.com/free-photo-hosting/) if you'd like to upload your own)
- a **markdown** table of at least 9 cells (e.g. 3 by 3, or 5 by 2)
- a list in **markdown**
- a link to another website

To practice typing in math mode, also include a LaTeX formula describing how your final grade is going to be calculated based on the syllabus, something like:

$$grade = weight_1*score_1 + weight_2*score_2$$

Please be **brief** in your text.  Aim for roughly 3 sentences total of text.

# European Immigration
This is a brief overview of immigration in Europe. 
![Arriving at Ellis Island](https://understandingrace.org/wp-content/uploads/2022/02/Arriving-at-Ellis-Island.jpg)
## The Trends behind European Immigration
European immigration varies greatly depending on multiple factors, including economic status, political situations, and country of origin. European immigration was also cross-continental.
### European Immigration To North America
Between 1815 and 1915, roughly 30 million Europeans migrated to North America, which was perceived as the land of economic opportunity.
Here is [more information](https://www.britannica.com/topic/Great-Atlantic-Migration) on European immigration to North America.

| 1 | 3 | 2 | 1 | 3 |
|---|---|---|---|---|
| 6 | 5 | 4 | 4 | 7 |
| 7 | 3 | 7 | 5 | 8 |
| 2 | 6 | 1 | 9 | 2 |

1. United States
2. Germany
3. Canada


Math Mode Practice

$$grade = 0.4*homework + 0.2*labgrade + 0.2*quizone + 0.2*quiztwo$$
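
The weighted sum in the formula above can be checked numerically. The following is a minimal sketch: the weights come from the formula, while the scores are hypothetical values invented for illustration.

```python
# Weights taken from the grade formula above; they should sum to 1.
weights = {"homework": 0.4, "labgrade": 0.2, "quizone": 0.2, "quiztwo": 0.2}

# Hypothetical scores on a 0-100 scale (illustrative values only).
scores = {"homework": 90, "labgrade": 85, "quizone": 80, "quiztwo": 70}

# The final grade is the weighted sum of the component scores.
grade = sum(weights[name] * scores[name] for name in weights)
print(round(grade, 2))
```

Keeping the weights in one dictionary makes it easy to confirm that they add up to 1 before computing the grade.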




# Part 2: Numpy
## Part 2.1: Creating Arrays (10 points)

Create the following two arrays using the NumPy library and then print them out. Call the first array `array_a` and the second array `array_b` (make sure you keep the `import` statement below):

$$\mathbf{array_a} = \begin{bmatrix}3 & 8 & -2 & 3\\
.5 & -1 & 6 & 4\\
-5 & 7 & -42 & 2
\end{bmatrix}$$

$$\mathbf{array_b} = \begin{bmatrix}42 & 38 & 34\\
30 & 26 & 22\\
18 & 14 & 10\\
6 & 2 & -2\\
-6 & -10 & -14
\end{bmatrix}$$


In [3]:
# make sure to import numpy library
import numpy as np

In [7]:
array_a = np.array([
    [3,8,-2,3],
    [.5,-1,6,4],
    [-5,7,-42,2]
])
array_b = np.array([
    [42,38,34],
    [30,26,22],
    [18,14,10],
    [6,2,-2],
    [-6,-10,-14]
])

In [8]:
# uncomment below to print array_a
array_a

array([[  3. ,   8. ,  -2. ,   3. ],
       [  0.5,  -1. ,   6. ,   4. ],
       [ -5. ,   7. , -42. ,   2. ]])

In [9]:
# uncomment below to print array_b
array_b

array([[ 42,  38,  34],
       [ 30,  26,  22],
       [ 18,  14,  10],
       [  6,   2,  -2],
       [ -6, -10, -14]])
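
The entries of `array_b` decrease by 4 at each step, from 42 down to -14, so the same array could also be generated rather than typed out. This is a minimal sketch using `np.arange` and `reshape`; the name `array_b_alt` is hypothetical and not part of the lab.

```python
import numpy as np

# 42, 38, ..., -14: start at 42, step by -4, stop before -18 (15 values),
# then arrange the values row by row into 5 rows and 3 columns.
array_b_alt = np.arange(42, -18, -4).reshape(5, 3)

# The literal version from the cell above, repeated here for comparison.
array_b = np.array([
    [42, 38, 34],
    [30, 26, 22],
    [18, 14, 10],
    [6, 2, -2],
    [-6, -10, -14],
])

# True when both arrays have the same shape and the same elements.
print(np.array_equal(array_b_alt, array_b))
```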

## Part 2.2: Exploring Arrays (15 points)

1. Give the shape, size, ndim, and nbytes for each of the two arrays.
1. Take the transpose of both arrays. Call these `t_array_a` and `t_array_b`.
1. Try to add `array_a` and `t_array_b` (*prove* and *show* you did this with commented out code), then remove the last column of `t_array_b` and try to add them again. In a markdown cell, explain what happened.

In [7]:
# array_a and array_b are the arrays created in Part 2.1; they are reused here, not redefined

# Shape, size, ndim, and nbytes of array_a
print(f"Shape = {array_a.shape}")
print(f"Size = {array_a.size}")
print(f"Dimensions = {array_a.ndim}")
print(f"Bytes = {array_a.nbytes}")

# Shape, size, ndim, and nbytes of array_b
print(f"Shape = {array_b.shape}")
print(f"Size = {array_b.size}")
print(f"Dimensions = {array_b.ndim}")
print(f"Bytes = {array_b.nbytes}")

# Transpose the arrays
t_array_a = array_a.T
t_array_b = array_b.T

# Print transposed arrays and their shapes
print(t_array_a)
print(f"{t_array_a.shape}")

print(t_array_b)
print(f"{t_array_b.shape}")

# Trying to add array_a and t_array_b (commented out because it raises a
# ValueError: shapes (3, 4) and (3, 5) cannot be broadcast together)
# mat = array_a + t_array_b


# Modify t_array_b by removing the last column
t_array_b_mod = t_array_b[:, :-1]


# Add array_a and the modified t_array_b
mat = array_a + t_array_b_mod

# Print
print(mat)

Shape = (3, 4)
Size = 12
Dimensions = 2
Bytes = 96
Shape = (5, 3)
Size = 15
Dimensions = 2
Bytes = 120
[[  3.    0.5  -5. ]
 [  8.   -1.    7. ]
 [ -2.    6.  -42. ]
 [  3.    4.    2. ]]
(4, 3)
[[ 42  30  18   6  -6]
 [ 38  26  14   2 -10]
 [ 34  22  10  -2 -14]]
(3, 5)
[[ 45.   38.   16.    9. ]
 [ 38.5  25.   20.    6. ]
 [ 29.   29.  -32.    0. ]]


### Explanation

Initially, adding `array_a` and `t_array_b` caused an error because their shapes do not match: `array_a` is 3 by 4, while `t_array_b`, the transpose of the 5 by 3 `array_b`, is 3 by 5.

NumPy adds arrays element by element, so the two arrays must have the same shape (or shapes that can be broadcast to a common one). A 3 by 4 array and a 3 by 5 array differ in the number of columns, and neither column count is 1, so they cannot be added.

Removing the last column of `t_array_b` left a 3 by 4 array, the same shape as `array_a`, and then the addition worked correctly, returning the element-wise sum.
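
The shape mismatch described above can be reproduced without the lab data. This is a minimal sketch, assuming placeholder arrays of zeros and ones with the same shapes as `array_a` (3 by 4) and `t_array_b` (3 by 5); the names `a`, `t_b`, `t_b_trimmed`, and `failed` are hypothetical.

```python
import numpy as np

a = np.zeros((3, 4))   # placeholder with the same shape as array_a
t_b = np.ones((3, 5))  # placeholder with the same shape as array_b transposed

# Element-wise addition needs matching (or broadcastable) shapes,
# so (3, 4) + (3, 5) raises a ValueError.
try:
    a + t_b
    failed = False
except ValueError as err:
    failed = True
    print(f"Addition failed: {err}")

# Dropping the last column leaves a (3, 4) array, which can be added to a.
t_b_trimmed = t_b[:, :-1]
result = a + t_b_trimmed
print(result.shape)
```

Wrapping the failing addition in `try`/`except` is one way to show the error while still letting the rest of the cell run.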

# Part 3: Pandas
## Part 3.1: Reading in Data (5 points)

On Canvas is the `train_stations_europe.csv` file. It was adapted from [this Kaggle data set](https://www.kaggle.com/datasets/headsortails/train-stations-in-europe). Read this data set in, using the `id` as the index column, and print the first few rows of the data. Make sure you keep the `import` statement below!

In [12]:
# make sure to import pandas library
import pandas as pd

In [13]:
train_data = pd.read_csv('train_stations_europe.csv', index_col='id')
train_data.head()


| id | name | latitude | longitude | parent_station_id | country | time_zone | is_city | is_main_station | is_airport |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Chateau-Arnoux-St-Auban | 44.08179 | 6.001625 | NaN | FR | Europe/Paris | True | False | False |
| 2 | Chateau-Arnoux-St-Auban | 44.061565 | 5.997373 | 1.0 | FR | Europe/Paris | False | True | False |
| 3 | Chateau-Arnoux Mairie | 44.063863 | 6.011248 | 1.0 | FR | Europe/Paris | False | False | False |
| 4 | Digne-les-Bains | 44.35 | 6.35 | NaN | FR | Europe/Paris | True | False | False |
| 6 | Digne-les-Bains | 44.08871 | 6.222982 | 4.0 | FR | Europe/Paris | False | True | False |
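
The effect of `index_col='id'` can be seen on a small in-memory sample. This is a minimal sketch, assuming the CSV file has a header row with the column names shown above; the three data rows are copied from the output, and the names `sample_csv` and `sample` are hypothetical.

```python
import io

import pandas as pd

# Three rows copied from the output above; the real file has many more.
sample_csv = io.StringIO(
    "id,name,latitude,longitude,parent_station_id,country,time_zone,"
    "is_city,is_main_station,is_airport\n"
    "1,Chateau-Arnoux-St-Auban,44.08179,6.001625,,FR,Europe/Paris,True,False,False\n"
    "2,Chateau-Arnoux-St-Auban,44.061565,5.997373,1.0,FR,Europe/Paris,False,True,False\n"
    "3,Chateau-Arnoux Mairie,44.063863,6.011248,1.0,FR,Europe/Paris,False,False,False\n"
)

# index_col="id" turns the id column into the row index instead of a data column.
sample = pd.read_csv(sample_csv, index_col="id")

print(sample.index.name)
print(sample.loc[2, "name"])  # rows can now be looked up by station id
```

The empty `parent_station_id` field in the first row is read as `NaN`, which is why that column is stored as floating point.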


## Part 3.2: Manipulating Data (20 points)

1. Create a subset of the data set which (a) **includes** only train stations in Belgium and (b) **excludes** all train stations which are **not** in a city. Make sure to save this subset as a new data frame and print the first few rows of the data.
1. Use the `.describe()` function to produce summary statistics for the subset from the previous part. Create a markdown cell and explain:
    - What Series did the `.describe()` function run on? What Series did it not run on? What is the difference, and what does this mean the `.describe()` function is used for?
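
The two conditions in step 1 can also be combined into a single boolean mask. This is a minimal sketch, assuming a small hypothetical data frame named `stations` that has the same `country` and `is_city` columns as the real data; its rows are invented for illustration.

```python
import pandas as pd

# Hypothetical rows; only the column names match the real data set.
stations = pd.DataFrame(
    {
        "name": ["Station A", "Station B", "Station C", "Station D"],
        "country": ["BE", "BE", "FR", "FR"],
        "is_city": [True, False, True, False],
    },
    index=pd.Index([1, 2, 3, 4], name="id"),
)

# Each condition is a boolean Series; & keeps rows where both are True.
mask = (stations["country"] == "BE") & stations["is_city"]

# .copy() makes the subset an independent data frame.
subset = stations[mask].copy()
print(subset)
```

Filtering in two separate steps gives the same rows; the single mask only avoids the intermediate data frame.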

In [18]:
# Part a: keep only stations in Belgium
belgium_stations = train_data[train_data['country'] == 'BE']

# Part b: keep only stations that are in a city
belgium_stations = belgium_stations[belgium_stations['is_city'] == True]

# Show the subset
print(belgium_stations.head())

# Summary statistics
belgium_stations.describe()

                  name   latitude  longitude  parent_station_id country  \
id                                                                        
5964         Antwerpen  51.221722   4.405860                NaN      BE   
5970    Blandain Ville  50.617431   3.263740                NaN      BE   
5974         Bruxelles  50.846520   4.351739                NaN      BE   
6003       Quevy Ville        NaN        NaN                NaN      BE   
6006  Sterpenich Ville        NaN        NaN                NaN      BE   

            time_zone  is_city  is_main_station  is_airport  
id                                                           
5964  Europe/Brussels     True            False       False  
5970  Europe/Brussels     True            False       False  
5974  Europe/Brussels     True            False       False  
6003  Europe/Brussels     True            False       False  
6006  Europe/Brussels     True            False       False  


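| | latitude | longitude | parent_station_id |
|---|---|---|---|
| count | 9.0 | 9.0 | 0.0 |
| mean | 50.719715 | 4.810847 | NaN |
| std | 0.36682 | 0.996916 | NaN |
| min | 49.999779 | 3.26374 | NaN |
| 25% | 50.617431 | 4.351739 | NaN |
| 50% | 50.718089 | 4.444643 | NaN |
| 75% | 50.979756 | 5.701847 | NaN |
| max | 51.221722 | 6.120155 | NaN |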
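
The summary above covers only the numeric columns, which is the default behaviour of `.describe()`. This is a minimal sketch of how the `include` parameter changes which columns are summarized, using three Belgian rows copied from the printed subset in a hypothetical data frame named `demo`.

```python
import pandas as pd

# Three rows copied from the printed subset above (a few columns only).
demo = pd.DataFrame(
    {
        "name": ["Antwerpen", "Blandain Ville", "Bruxelles"],
        "latitude": [51.221722, 50.617431, 50.846520],
        "longitude": [4.405860, 3.263740, 4.351739],
        "country": ["BE", "BE", "BE"],
        "is_city": [True, True, True],
    },
    index=pd.Index([5964, 5970, 5974], name="id"),
)

# Default: only numeric columns get count, mean, std, min, quartiles, and max.
numeric_summary = demo.describe()
print(list(numeric_summary.columns))

# include="all" also summarizes text and boolean columns,
# adding rows such as unique, top, and freq.
full_summary = demo.describe(include="all")
print(list(full_summary.columns))
```

Text and boolean columns are skipped by default because statistics such as the mean and standard deviation are only defined for numeric data.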


# Part 4: GitHub (10 points)

When you finish the first three parts, restart the kernel and run all cells in this file one last time. Then go to the [DS 3000 GitHub](https://github.com/eaegerber/ds3000_summer25), fork it, and clone your fork to your local machine. Then:

- In the forked and cloned ds3000_summer25 repo, make a new branch (`git checkout -b lab-upload`)
- Create a folder with your name (as Dr. Gerber did)
- Navigate to that folder and place this jupyter notebook inside
- Add the file as the change to be committed (`git add Lab1_MyName.ipynb`)
- Check `git status`; your changes should be staged
- Commit the changes with a short message (`git commit -m "My message"`)
- Push the repository to GitHub (`git push origin lab-upload`)
- Navigate to the repository on GitHub in the browser and create a pull request
- Once the pull request is merged, you can delete your branch in the "Pull Requests" tab in the browser
- Finally, upload the Lab to Gradescope. When you do so, it will give you the option to do it via GitHub; do that.