# Practical Exercises 
Exercises for practicing the NumPy skills learned so far.

## 1. Working with API Image Data in NumPy and Matplotlib
- Write a program to download the images from the [Metropolitan Museum of Art API](https://metmuseum.github.io)

- Generate an image (like the one shown below) by plotting random images from the collection

<img src="assets/met_example_image.png" alt="" width="500"/>

In [1]:
import requests
import json
import numpy as np
from PIL import Image

In [2]:
url_objects = "https://collectionapi.metmuseum.org/public/collection/v1/objects"
response = requests.get(url_objects)
data_objects = response.json()
objectIDs_all = data_objects["objectIDs"]
objectIDs_all_count = len(objectIDs_all)
print(f"Number of object-IDs: {objectIDs_all_count}")

Number of object-IDs: 478014


In [3]:
image_counts = (8, 4)  # Columns, Rows (corresponds to Width, Height)
max_loops = image_counts[0] * image_counts[1] * 10


In [6]:
image_counter = 0
loop_counter = 0
images = []

print(
    f"Starting Loop. Searching for \
    {image_counts[0]} x {image_counts[1]} = \
    {image_counts[0] * image_counts[1]} images. Loop-max: {max_loops}"
)
print("images: ", end="")

while (image_counter < image_counts[0] * image_counts[1]) and (loop_counter < max_loops):
    loop_counter += 1

    # get random object-ID
    objectID = objectIDs_all[np.random.randint(0, objectIDs_all_count)]

    # get data for object
    url = f"https://collectionapi.metmuseum.org/public/collection/v1/objects/{objectID}"
    response = requests.get(url)
    data = response.json()
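    # use .get() so a response without this key (e.g. a failed lookup) is skipped instead of raising KeyError
    image_url = data.get("primaryImageSmall", "")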

    # load image if possible

    if image_url != "":
        im = Image.open(requests.get(image_url, stream=True).raw)
        images.append(im)
        image_counter += 1
        print("|", end="")
        
print(f"Loaded images: {image_counter} of {image_counts[0] * image_counts[1]}")

Starting Loop. Searching for     8 x 4 =     32 images. Loop-max: 320
images: ||||||||||||||||||||||||||||||||Loaded images: 32 of 32


In [6]:
image_size = (500, 500)  # Width, Height
picture_size = (image_counts[0] * image_size[0], image_counts[1] * image_size[1])

pos_x = np.linspace(0, picture_size[0], image_counts[0], endpoint=False)
pos_y = np.linspace(0, picture_size[1], image_counts[1], endpoint=False)
pos = np.array(np.meshgrid(pos_x, pos_y), dtype="int")
coords = [pos[0].ravel(), pos[1].ravel()]

picture = Image.new("RGBA", picture_size)
coords

[array([   0,  500, 1000, 1500, 2000, 2500, 3000, 3500,    0,  500, 1000,
        1500, 2000, 2500, 3000, 3500,    0,  500, 1000, 1500, 2000, 2500,
        3000, 3500,    0,  500, 1000, 1500, 2000, 2500, 3000, 3500]),
 array([   0,    0,    0,    0,    0,    0,    0,    0,  500,  500,  500,
         500,  500,  500,  500,  500, 1000, 1000, 1000, 1000, 1000, 1000,
        1000, 1000, 1500, 1500, 1500, 1500, 1500, 1500, 1500, 1500])]
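
The same tile positions can be cross-checked with plain integer arithmetic. This is a minimal sketch, assuming the 8 x 4 grid and 500 x 500 tiles used in the cell above; the names `cols`, `rows`, `tile_w` and `tile_h` are introduced here for illustration only.

```python
import numpy as np

cols, rows = 8, 4          # grid layout (assumed, matches image_counts above)
tile_w, tile_h = 500, 500  # tile size (assumed, matches image_size above)

# A running tile index splits into (row, column) with divmod
idx = np.arange(cols * rows)
row, col = np.divmod(idx, cols)

xs = col * tile_w  # left edge of each tile
ys = row * tile_h  # top edge of each tile

print(xs[:9])
print(ys[:9])
```

The index runs across a row first and then moves down, which is the same order that `np.meshgrid` followed by `ravel()` produces.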

In [8]:
for image, x, y in zip(images, coords[0], coords[1]):
    # resize each image to the tile size and paste it at its grid position
    tile = image.resize(image_size)
    picture.paste(tile, box=(int(x), int(y)))

picture.show()
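
As an alternative to pasting with PIL, the grid can also be assembled as a single NumPy array. The following is a sketch only: the helper `make_montage` is hypothetical, and random arrays stand in for the downloaded images so the example runs without network access. It assumes all tiles have the same height, width and channel count.

```python
import numpy as np


def make_montage(tiles, cols):
    """Arrange equally sized (H, W, C) tiles into one grid image, row by row."""
    tiles = np.asarray(tiles)
    n, h, w, c = tiles.shape
    rows = n // cols
    if rows * cols != n:
        raise ValueError("number of tiles must be a multiple of cols")
    # (rows, cols, h, w, c) -> (rows, h, cols, w, c) -> (rows*h, cols*w, c)
    grid = tiles.reshape(rows, cols, h, w, c).transpose(0, 2, 1, 3, 4)
    return grid.reshape(rows * h, cols * w, c)


rng = np.random.default_rng(0)
# Stand-in data: 32 random RGB tiles of 50 x 50 pixels (assumption)
tiles = rng.integers(0, 256, size=(32, 50, 50, 3), dtype=np.uint8)

montage = make_montage(tiles, cols=8)
print(montage.shape)  # (200, 400, 3)
```

With real data, each tile could be obtained via `np.asarray(im.convert("RGB").resize(image_size))`, and the result displayed with `plt.imshow(montage)` or converted back with `Image.fromarray(montage)`.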

## 2. Regression Coefficients
The formula for the regression coefficients is

$\beta = (X'X)^{-1}X'Y$
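
To make the formula concrete, here is a minimal sketch on synthetic data (not the exercise data). The coefficients in `true_beta`, the sample size and the noise level are made up for illustration. The result of the direct formula is compared with `np.linalg.lstsq`, which solves the same least-squares problem.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200

# Synthetic design matrix: a column of ones (intercept) plus two regressors
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
true_beta = np.array([1.0, 2.0, -3.0])  # made-up coefficients
y = X @ true_beta + rng.normal(scale=0.1, size=n)

# Direct translation of beta = (X'X)^(-1) X'Y
beta = np.linalg.inv(X.T @ X) @ X.T @ y

# Cross-check with NumPy's least-squares solver
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(beta)
print(beta_lstsq)
```

Both estimates should agree closely and lie near `true_beta`. For badly conditioned data, `np.linalg.solve(X.T @ X, X.T @ y)` or `np.linalg.lstsq` is usually preferable to forming the inverse explicitly.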

However, the data is a bit messed up: the variables are stored in a single flat array, i.e. a 1xN vector, instead of a table with one column per variable. In other words, the data was changed from this:

<img src="assets/data_before.png" alt="" width="500"/>

to this:

<img src="assets/data_after.png" alt="" width="700"/>

The array contains the following variables: 

- Sale (in Dollars) - Amount of money received by the store
- Pack Size - Number of bottles per item
- State Bottle Cost - Cost of producing the bottle 
- Packs Sold - Number of bottles sold
- Bottle Volume (in ml) - How many ml each bottle has
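
One way to get from the flat vector back to a table is `reshape`. The sketch below uses invented numbers and assumes that the values are stored observation by observation, in the order of the list above; neither assumption is confirmed by the exercise data, so check the actual layout first.

```python
import numpy as np

columns = ["Sale", "Pack Size", "State Bottle Cost", "Packs Sold", "Bottle Volume"]

# Hypothetical 1xN vector holding 3 observations, stored row by row (assumption)
flat = np.array([[60.0, 6, 5.0, 2, 750,
                  120.0, 12, 4.0, 3, 1000,
                  45.0, 6, 7.5, 1, 750]])

# One row per observation, one column per variable
table = flat.reshape(-1, len(columns))

y = table[:, 0]                                            # dependent variable: Sale
X = np.column_stack([np.ones(len(table)), table[:, 1:]])   # intercept + regressors

print(table.shape)  # (3, 5)
print(X.shape)      # (3, 5)
```

If the vector were instead stored variable by variable, `flat.reshape(len(columns), -1).T` would give the same table layout.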



Question: Determine the regression coefficients of the following OLS regression:

$\text{Sale} = \beta_0 + \beta_1 \cdot \text{Pack Size} + \beta_2 \cdot \text{State Bottle Cost} + \beta_3 \cdot \text{Packs Sold} + \beta_4 \cdot \text{Bottle Volume} + \epsilon$