# Graded: 5 of 5 correct
- [x] Compute the list of all manufacturers whose "Real-World Comb MPG" was higher than 25
- [x] Find the list of all manufacturers whose real-world comb MPG was higher than 25 AND whose 0-60 time is less than 8 seconds
- [x] What was the HP for GM vehicles?
- [x] What were the real-world comb MPG and weight for VW?
- [x] Create a Numpy array that has the real-world comb MPG, weight and 0-60 time (in that order) for Ford, VW and Honda (in that order)

Comments:


## Unit 5 Numpy_Boolean_Selection Quiz
* The objective of this quiz is to understand how to use boolean arrays for selecting subsets from Numpy arrays.

In [74]:
## Import Numpy and check version
import numpy as np
print(f"Numpy version is {np.__version__}")
import pickle
import comp116
import pandas as pd

Numpy version is 1.26.4


### Boolean arrays are very useful in selecting subsets of arrays.
* Boolean array selection is useful not just for Numpy arrays but also for Pandas data frames.
* In this quiz you will be working with fuel economy data for selected automotive vehicle manufacturers for the year 2017.
    * You can get current data similar to this from [here](https://www.fueleconomy.gov/feg/download.shtml).
    * The data you get will need to be cleaned up to get into a format like what we are using in this quiz.
* You will be reading in 3 Numpy arrays, `data_var_names`, `manufacturer_names`, and `mpgData`.
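    * `data_var_names` is the array of variable names, one for each column of the data.
    * `manufacturer_names` is the array of manufacturer names, one for each row of the data.
    * `mpgData` is the array of numeric values, with one row per manufacturer and one column per variable.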
* You should consider these three arrays as jointly describing a table.
    * `data_var_names` is the first row of the table with headers;
    * `manufacturer_names` is the first column of row labels;
    * `mpgData` is the data itself
    
* When you learn about pandas, you will see that the pandas data frame is an elegant way of keeping these 3 pieces of information in one object, the data frame. For now, we work with numpy arrays.
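As a rough illustration of how the three arrays line up, here is a minimal sketch that uses small stand-in arrays rather than the pickle file. The three rows and three columns are copied from the 2017 table in this quiz; the reduced size is an assumption made only to keep the example short.

```python
import numpy as np

# Stand-in arrays (assumption: same layout as the pickle contents, fewer rows and columns)
data_var_names = np.array(["Real-World Comb MPG", "Weight (lbs)", "HP"])
manufacturer_names = np.array(["GM", "Toyota", "Ford"])
mpgData = np.array([
    [22.9, 4520.0, 265.0],
    [25.3, 4059.0, 216.0],
    [22.9, 4360.0, 262.0],
])

# The arrays describe one table: one row per manufacturer, one column per variable
assert mpgData.shape == (len(manufacturer_names), len(data_var_names))

# Print each row label next to its values, keyed by column header
for name, row in zip(manufacturer_names.tolist(), mpgData.tolist()):
    print(name, dict(zip(data_var_names.tolist(), row)))
```

The shape check is the key point: position `i` in `manufacturer_names` labels row `i` of `mpgData`, and position `j` in `data_var_names` labels column `j`.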

In [3]:
### Read the pickle file
with open('fuel_economy.data.pickle', 'rb') as fid:
    (data_var_names, manufacturer_names, mpgData) = pickle.load(fid)
comp116.array_to_html(mpgData, row_names=manufacturer_names, col_names=data_var_names,
                      title='2017 Fuel Economy data by Manufacturer')

| Manufacturer | Real-World Comb MPG | Real-World Comb CO2 g/mi | Weight (lbs) | HP | 0-60 Time (s) |
|---|---|---|---|---|---|
| GM | 22.9 | 388.0 | 4520.0 | 265.0 | 8.0 |
| Toyota | 25.3 | 351.0 | 4059.0 | 216.0 | 8.5 |
| Ford | 22.9 | 388.0 | 4360.0 | 262.0 | 7.9 |
| FCA | 21.2 | 420.0 | 4510.0 | 280.0 | 7.5 |
| Nissan-Mitsubishi | 27.1 | 327.0 | 3770.0 | 201.0 | 8.9 |
| Honda | 29.4 | 302.0 | 3595.0 | 203.0 | 8.1 |
| Hyundai | 28.6 | 311.0 | 3458.0 | 176.0 | 8.9 |
| Subaru | 28.5 | 312.0 | 3724.0 | 181.0 | 9.3 |
| VW | 26.5 | 335.0 | 3894.0 | 225.0 | 7.9 |
| Kia | 27.2 | 327.0 | 3592.0 | 186.0 | 8.8 |


***
### For the rest of this quiz, you should use comparisons and boolean values to select subsets. You should not make use of the absolute index of the data.
* For example, you know that `"Real-World Comb MPG"` is in the column with index 0, and `"Ford"` is the row with index 2 in `mpgData`. But do not use `mpgData[2, 0]` to get to the real-world mpg data for Ford. Look at the course lecture on this topic to refresh.
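To see why label comparisons are preferred over absolute positions, here is a minimal sketch on stand-in arrays (three rows and columns copied from the table above; the reordering step is hypothetical and only there to make the point). A boolean mask built from the labels still finds Ford after the rows are shuffled, while a hard-coded row number would not.

```python
import numpy as np

# Stand-in arrays (assumption: a small slice of the quiz data)
data_var_names = np.array(["Real-World Comb MPG", "Weight (lbs)", "HP"])
manufacturer_names = np.array(["GM", "Toyota", "Ford"])
mpgData = np.array([
    [22.9, 4520.0, 265.0],
    [25.3, 4059.0, 216.0],
    [22.9, 4360.0, 262.0],
])

# Boolean masks built by comparing labels, not by counting positions
row_mask = (manufacturer_names == "Ford")
col_mask = (data_var_names == "Weight (lbs)")

# Select the matching row(s) first, then the matching column(s)
ford_weight = mpgData[row_mask][:, col_mask]

# Hypothetical reordering of the rows: the same comparison still finds Ford
order = np.array([2, 0, 1])
shuffled_names = manufacturer_names[order]
shuffled_data = mpgData[order]
ford_weight_shuffled = shuffled_data[shuffled_names == "Ford"][:, col_mask]

print(ford_weight, ford_weight_shuffled)
```

Selecting in two steps (`[row_mask]` and then `[:, col_mask]`) always gives a 2-D result, even when each mask has a single `True`.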

### Compute the list of all manufacturers whose `"Real-World Comb MPG"` was higher than 25.

In [28]:
# Your code here
col_boolean = (data_var_names == "Real-World Comb MPG")
col_offset = np.argmax(col_boolean)
print("\nCol Offset:", col_offset)

mpg_boolean = (mpgData[:, col_offset] > 25)
print("\nMPG Boolean:",mpg_boolean)

print(f'\nList of manufacturers > 25 RWC MPG: {manufacturer_names[mpg_boolean]}')



Col Offset: 0

MPG Boolean: [False  True False False  True  True  True  True  True  True False  True
  True]

List of manufacturers > 25 RWC MPG: ['Toyota' 'Nissan-Mitsubishi' 'Honda' 'Hyundai' 'Subaru' 'VW' 'Kia' 'BMW'
 'Mazda']
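One thing to watch when turning a boolean mask into an offset with `np.argmax`: if no label matches, the mask is all `False` and `np.argmax` still returns 0, which silently points at the first column. The sketch below shows that behavior and one possible guard. The helper `column_offset` is a hypothetical name introduced here for illustration, not part of the quiz or of `comp116`.

```python
import numpy as np

def column_offset(var_names, label):
    """Return the index of the single column named `label`; raise if it is missing or repeated."""
    matches = np.flatnonzero(var_names == label)
    if matches.size != 1:
        raise KeyError(f"expected exactly one column named {label!r}, found {matches.size}")
    return int(matches[0])

# Stand-in header array (assumption: a subset of the quiz headers)
data_var_names = np.array(["Real-World Comb MPG", "Weight (lbs)", "HP"])

# A misspelled label gives an all-False mask, and argmax quietly returns 0
typo_mask = (data_var_names == "Horsepower")
silent_offset = np.argmax(typo_mask)

# The guarded lookup returns the right offset for a valid label ...
hp_offset = column_offset(data_var_names, "HP")

# ... and raises for the misspelled one instead of pointing at column 0
try:
    column_offset(data_var_names, "Horsepower")
except KeyError as err:
    print("lookup failed:", err)
```

With correctly spelled labels, as in the answers here, `np.argmax` and the guarded lookup agree.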


### Find the list of all manufacturers whose real-world comb MPG was higher than 25 AND whose 0-60 time is less than 8 seconds

In [32]:
# Your code here
# Grab column for MPG
col_boolean_1 = (data_var_names == "Real-World Comb MPG")
col_offset_1 = np.argmax(col_boolean_1)

# Grab column for 0-60 time
col_boolean_2 = (data_var_names == "0-60 Time (s)")
col_offset_2 = np.argmax(col_boolean_2)

print(f"\nMPG Offset: {col_offset_1}\n0-60 Offset: {col_offset_2}")

# Logic for MPG
mpg_boolean = (mpgData[:, col_offset_1] > 25)

# Logic for 0-60 
time_boolean = (mpgData[:, col_offset_2] < 8)

print(f"\nMPG Boolean: {mpg_boolean}\n0-60 Boolean: {time_boolean}")

# Create mask for both requirements
mask_boolean = mpg_boolean & time_boolean

print(f'\nList of manufacturers where RWC MPG > 25 and 0-60 time < 8s: {manufacturer_names[mask_boolean]}')


MPG Offset: 0
0-60 Offset: 4

MPG Boolean: [False  True False False  True  True  True  True  True  True False  True
  True]
0-60 Boolean: [False False  True  True False False False False  True False  True  True
 False]

List of manufacturers where RWC MPG > 25 and 0-60 time < 8s: ['VW' 'BMW']
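The combination step relies on the element-wise `&` operator. Here is a minimal sketch on stand-in 1-D arrays (five manufacturers copied from the table above, with the MPG and 0-60 columns pulled out into their own arrays for brevity). Each comparison needs its own parentheses because `&` binds more tightly than `>` and `<`, and Python's `and` cannot be used because it does not work element by element on arrays.

```python
import numpy as np

# Stand-in columns (assumption: five rows taken from the quiz table)
manufacturer_names = np.array(["GM", "Toyota", "Ford", "FCA", "VW"])
mpg = np.array([22.9, 25.3, 22.9, 21.2, 26.5])
zero_to_sixty = np.array([8.0, 8.5, 7.9, 7.5, 7.9])

# Element-wise AND of two boolean masks; parentheses around each comparison are required
both = (mpg > 25) & (zero_to_sixty < 8)

# np.logical_and is an equivalent spelling of the same operation
same = np.logical_and(mpg > 25, zero_to_sixty < 8)

selected = manufacturer_names[both]
print(selected)
```

Of these five rows only VW satisfies both conditions: Toyota passes the MPG test but not the 0-60 test, while Ford and FCA pass the 0-60 test but not the MPG test.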


### What was the HP for GM vehicles?

In [40]:
# Your code here
GM_vehicles = (manufacturer_names == "GM")
HP_offset = np.argmax(data_var_names == "HP")

print(f'The HP for GM vehicles: {mpgData[GM_vehicles, HP_offset]}')

The HP for GM vehicles: [265.]


### What were the real-world comb MPG and weight for VW?

In [43]:
# Your code here

# get indexes
VW_vehicles = (manufacturer_names == "VW")
weight_offset = np.argmax(data_var_names == "Weight (lbs)")
mpg_offset = np.argmax(data_var_names == "Real-World Comb MPG")  # recomputed here so the cell does not depend on an earlier cell

print(f'Real World Comb MPG for VW: {mpgData[VW_vehicles, mpg_offset]}\nWeight for VW: {mpgData[VW_vehicles, weight_offset]}')

Real World Comb MPG for VW: [26.5]
Weight for VW: [3894.]
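The two values can also be pulled out in a single selection by building one column mask that covers both headers. This is a sketch of an alternative, not the approach used in the answer above; it runs on stand-in arrays with three rows copied from the quiz table and uses `np.isin` to test each header against a list of wanted names.

```python
import numpy as np

# Stand-in arrays (assumption: three rows of the quiz table, all five columns)
data_var_names = np.array(["Real-World Comb MPG", "Real-World Comb CO2 g/mi",
                           "Weight (lbs)", "HP", "0-60 Time (s)"])
manufacturer_names = np.array(["GM", "Toyota", "VW"])
mpgData = np.array([
    [22.9, 388.0, 4520.0, 265.0, 8.0],
    [25.3, 351.0, 4059.0, 216.0, 8.5],
    [26.5, 335.0, 3894.0, 225.0, 7.9],
])

# Row mask from one label, column mask from a list of labels
row_mask = (manufacturer_names == "VW")
col_mask = np.isin(data_var_names, ["Real-World Comb MPG", "Weight (lbs)"])

# Rows first, then columns; the result is a 1 x 2 array
vw_values = mpgData[row_mask][:, col_mask]
print(data_var_names[col_mask], vw_values)
```

A boolean mask returns columns in the order they appear in the table, not in the order of the list passed to `np.isin`, so printing `data_var_names[col_mask]` alongside the values keeps the labels and numbers aligned.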


***

### Create a Numpy array that has the real-world comb MPG, weight and 0-60 time (in that order) for Ford, VW and Honda (in that order)

In [73]:
# Your code here

# Boolean row masks for Ford, VW, and Honda
Ford_vehicles = (manufacturer_names == "Ford")
VW_vehicles = (manufacturer_names == "VW")
Honda_vehicles = (manufacturer_names == "Honda")

# Column offsets for the three requested variables
mpg_offset = np.argmax(data_var_names == "Real-World Comb MPG")
weight_offset = np.argmax(data_var_names == "Weight (lbs)")
time_offset = np.argmax(data_var_names == "0-60 Time (s)")

# Create array 
output_array = np.array([
    [
        manufacturer_names[Ford_vehicles], 
        mpgData[Ford_vehicles, mpg_offset], 
        mpgData[Ford_vehicles, weight_offset], 
        mpgData[Ford_vehicles, time_offset]
    ],
    [
        manufacturer_names[VW_vehicles], 
        mpgData[VW_vehicles, mpg_offset], 
        mpgData[VW_vehicles, weight_offset], 
        mpgData[VW_vehicles, time_offset]
    ],
    [
        manufacturer_names[Honda_vehicles], 
        mpgData[Honda_vehicles, mpg_offset], 
        mpgData[Honda_vehicles, weight_offset], 
        mpgData[Honda_vehicles, time_offset]
    ]
])

# Print
print(output_array)



[[['Ford']
  ['22.9']
  ['4360.0']
  ['7.9']]

 [['VW']
  ['26.5']
  ['3894.0']
  ['7.9']]

 [['Honda']
  ['29.4']
  ['3595.0']
  ['8.1']]]
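The printed result above is a 3 x 4 x 1 array of strings, because placing the manufacturer names in the same array as the numbers makes Numpy store every entry as text. If a purely numeric 3 x 3 array is wanted, one possible alternative is sketched below on stand-in arrays (four rows copied from the quiz table). It looks up each requested label by comparison, in the requested order, and then uses `np.ix_` to select the row and column combination. A single boolean row mask would not be enough here, since a mask returns rows in table order (Ford, Honda, VW) rather than the requested order (Ford, VW, Honda).

```python
import numpy as np

# Stand-in arrays (assumption: four rows of the quiz table, all five columns)
data_var_names = np.array(["Real-World Comb MPG", "Real-World Comb CO2 g/mi",
                           "Weight (lbs)", "HP", "0-60 Time (s)"])
manufacturer_names = np.array(["GM", "Ford", "Honda", "VW"])
mpgData = np.array([
    [22.9, 388.0, 4520.0, 265.0, 8.0],
    [22.9, 388.0, 4360.0, 262.0, 7.9],
    [29.4, 302.0, 3595.0, 203.0, 8.1],
    [26.5, 335.0, 3894.0, 225.0, 7.9],
])

wanted_rows = ["Ford", "VW", "Honda"]
wanted_cols = ["Real-World Comb MPG", "Weight (lbs)", "0-60 Time (s)"]

# One comparison per label, so the offsets come out in the requested order
row_idx = [np.argmax(manufacturer_names == name) for name in wanted_rows]
col_idx = [np.argmax(data_var_names == name) for name in wanted_cols]

# np.ix_ builds the cross product of the chosen rows and columns
subset = mpgData[np.ix_(row_idx, col_idx)]
print(subset.shape, subset.dtype)
print(subset)
```

The labels stay in `wanted_rows` and `wanted_cols`, and `subset` keeps its float dtype, so later arithmetic on the values works without converting back from strings.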


***
#### Testing

Attempting to build an output table

In [90]:
row_names = [
    manufacturer_names[Ford_vehicles].item(), 
    manufacturer_names[VW_vehicles].item(), 
    manufacturer_names[Honda_vehicles].item()
    ]

col_names = [
    data_var_names[mpg_offset], 
    data_var_names[weight_offset], 
    data_var_names[time_offset]
    ]

data_table = [
    [
        mpgData[Ford_vehicles, mpg_offset].item(), 
        mpgData[Ford_vehicles, weight_offset].item(), 
        mpgData[Ford_vehicles, time_offset].item()
    ],
    [
        mpgData[VW_vehicles, mpg_offset].item(), 
        mpgData[VW_vehicles, weight_offset].item(), 
        mpgData[VW_vehicles, time_offset].item()
    ],
    [
        mpgData[Honda_vehicles, mpg_offset].item(), 
        mpgData[Honda_vehicles, weight_offset].item(), 
        mpgData[Honda_vehicles, time_offset].item()
    ]
]

data_table = np.array(data_table)

comp116.array_to_html(data_table, row_names=row_names, col_names=col_names,
                     title='2017 Fuel Economy data (RWC MPG, Weight, 0-60 Time) by Manufacturer (Ford, VW, Honda)')

| Manufacturer | Real-World Comb MPG | Weight (lbs) | 0-60 Time (s) |
|---|---|---|---|
| Ford | 22.9 | 4360.0 | 7.9 |
| VW | 26.5 | 3894.0 | 7.9 |
| Honda | 29.4 | 3595.0 | 8.1 |
