## Unit 5 Numpy_Boolean_Selection Quiz
* The objective of this quiz is to understand how to use boolean arrays for selecting subsets from Numpy arrays.

In [1]:
## Import Numpy and check version
import numpy as np
print(f"Numpy version is {np.__version__}")
import pickle
import comp116

Numpy version is 1.24.3


### Boolean arrays are very useful in selecting subsets of arrays.
* Boolean array selection is useful not just for Numpy arrays but also for Pandas data frames.
* In this quiz you will be working with fuel economy data for selected automotive vehicle manufacturers for the year 2017.
    * You can get current data similar to this from [here](https://www.fueleconomy.gov/feg/download.shtml).
    * The data you get will need to be cleaned up to get into a format like what we are using in this quiz.
* You will be reading in 3 Numpy arrays, `data_var_names`, `manufacturer_names`, and `mpgData`.
    * `data_var_names` is the array of variable names, one per column of data.
    * `manufacturer_names` is the array of manufacturer names, one per row of data.
    * `mpgData` is the numeric data itself.
* You should consider these three arrays as jointly describing a table.
    * `data_var_names` is the header row of the table;
    * `manufacturer_names` is the first column, holding the row labels;
    * `mpgData` is the body of the table.
    
* When you learn about pandas, you will see that the pandas data frame is an elegant way of keeping these 3 pieces of information in one object, the data frame. For now, we work with numpy arrays.
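As a minimal sketch of how the three arrays line up, the following uses a small subset of the table from this quiz (three manufacturers, two variables), typed in by hand rather than loaded from the pickle file. The names `cols`, `names` and `data` are placeholders for illustration, not the quiz variables.

```python
import numpy as np

# Hand-typed subset of the quiz table (assumption: stands in for the pickle data)
cols = np.array(["Real-World Comb MPG", "HP"])   # column headers
names = np.array(["GM", "Toyota", "Ford"])       # row labels
data = np.array([[22.9, 265.0],
                 [25.3, 216.0],
                 [22.9, 262.0]])                 # one row per manufacturer

# The shapes must agree: one label per row and one header per column
assert data.shape == (len(names), len(cols))

# Comparing a label array with a string gives a boolean mask
row_mask = names == "Toyota"
col_mask = cols == "HP"
print(row_mask)   # marks which rows are Toyota
print(col_mask)   # marks which columns are HP
```

Each mask has the same length as the axis it describes, which is what lets it be used later to pick rows or columns out of `data`.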

In [2]:
### Read the pickle file
with open('fuel_economy.data.pickle', 'rb') as fid:
    (data_var_names, manufacturer_names, mpgData) = pickle.load(fid)
comp116.array_to_html(mpgData, row_names=manufacturer_names, col_names=data_var_names,
                      title='2017 Fuel Economy data by Manufacturer')

| Manufacturer | Real-World Comb MPG | Real-World Comb CO2 g/mi | Weight (lbs) | HP | 0-60 Time (s) |
| --- | --- | --- | --- | --- | --- |
| GM | 22.9 | 388.0 | 4520.0 | 265.0 | 8.0 |
| Toyota | 25.3 | 351.0 | 4059.0 | 216.0 | 8.5 |
| Ford | 22.9 | 388.0 | 4360.0 | 262.0 | 7.9 |
| FCA | 21.2 | 420.0 | 4510.0 | 280.0 | 7.5 |
| Nissan-Mitsubishi | 27.1 | 327.0 | 3770.0 | 201.0 | 8.9 |
| Honda | 29.4 | 302.0 | 3595.0 | 203.0 | 8.1 |
| Hyundai | 28.6 | 311.0 | 3458.0 | 176.0 | 8.9 |
| Subaru | 28.5 | 312.0 | 3724.0 | 181.0 | 9.3 |
| VW | 26.5 | 335.0 | 3894.0 | 225.0 | 7.9 |
| Kia | 27.2 | 327.0 | 3592.0 | 186.0 | 8.8 |


### For the rest of this quiz, you should use comparisons and boolean values to select subsets. You should not make use of the absolute index of the data.
* For example, you know that `"Real-World Comb MPG"` is in the column with index 0, and `"Ford"` is the row with index 2 in `mpgData`. But do not use `mpgData[2, 0]` to get the real-world MPG data for Ford. Review the course lecture on this topic if you need a refresher.
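One way to reach a single cell by name rather than by position is sketched below. It uses a hand-typed subset of the quiz table, and the names `cols`, `names`, `data`, `row_mask` and `col_mask` are illustrative placeholders.

```python
import numpy as np

# Hand-typed subset of the quiz table (assumption: stands in for the pickle data)
cols = np.array(["Real-World Comb MPG", "Weight (lbs)"])
names = np.array(["GM", "Toyota", "Ford"])
data = np.array([[22.9, 4520.0],
                 [25.3, 4059.0],
                 [22.9, 4360.0]])

# Build the masks from the labels instead of writing data[2, 0]
row_mask = names == "Ford"
col_mask = cols == "Real-World Comb MPG"

# Select rows first, then columns; each step keeps the array 2-D
ford_mpg = data[row_mask][:, col_mask]
print(ford_mpg.shape)    # (1, 1)
print(ford_mpg.item())   # 22.9
```

Because the masks are derived from the label arrays, the selection keeps working if the rows or columns of the table are reordered.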

### Compute the list of all manufacturers whose `"Real-World Comb MPG"` was higher than 25.

In [3]:
# Your code here
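# Select the MPG column by name rather than by absolute index
mpgColBoolArray = data_var_names == "Real-World Comb MPG"
highMPGBoolArray = mpgData[:, mpgColBoolArray].ravel() > 25
highMPGoffset = np.argwhere(highMPGBoolArray)
highMPGmanufacturers = manufacturer_names[highMPGoffset]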
print(highMPGmanufacturers)

[['Toyota']
 ['Nissan-Mitsubishi']
 ['Honda']
 ['Hyundai']
 ['Subaru']
 ['VW']
 ['Kia']
 ['BMW']
 ['Mazda']]
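The nested brackets in the output come from `np.argwhere`, which returns one row per match. As a small sketch on hand-typed values taken from the table (the names `names`, `mpg` and `mask` are placeholders), indexing with the boolean array directly gives a flat array of names instead:

```python
import numpy as np

# Hand-typed values from the quiz table (assumption: stands in for the pickle data)
names = np.array(["GM", "Toyota", "Ford", "Honda"])
mpg = np.array([22.9, 25.3, 22.9, 29.4])

mask = mpg > 25

via_argwhere = names[np.argwhere(mask)]   # shape (2, 1): one name per row
via_mask = names[mask]                    # shape (2,): flat array of names

print(via_argwhere.shape, via_mask.shape)
print(via_mask)
```

Both forms select the same manufacturers; only the shape of the result differs.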


### Find the list of all manufacturers whose real-world comb MPG was higher than 25 AND whose 0-60 time was less than 8 seconds

In [4]:
# Your code here
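# Select the 0-60 time column by name rather than by absolute index
accBoolArray = mpgData[:, data_var_names == "0-60 Time (s)"].ravel() < 8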
bothBoolArray= highMPGBoolArray & accBoolArray
bothOffset = np.argwhere(bothBoolArray)
bothManufacturers = manufacturer_names[bothOffset]
print(bothManufacturers)

[['VW']
 ['BMW']]
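Boolean arrays are combined element by element with `&` (and), `|` (or) and `~` (not); the Python keywords `and`, `or` and `not` do not work on arrays. A short sketch on hand-typed values from the table, with placeholder names `names`, `mpg` and `time_0_60`:

```python
import numpy as np

# Hand-typed values from the quiz table (assumption: stands in for the pickle data)
names = np.array(["Toyota", "Ford", "Honda", "VW"])
mpg = np.array([25.3, 22.9, 29.4, 26.5])
time_0_60 = np.array([8.5, 7.9, 8.1, 7.9])

# Parentheses are required because & binds more tightly than > and <
both = (mpg > 25) & (time_0_60 < 8)
either = (mpg > 25) | (time_0_60 < 8)
not_high = ~(mpg > 25)

print(names[both])       # high MPG and quick
print(names[either])     # high MPG or quick
print(names[not_high])   # MPG of 25 or lower
```

Naming each condition first and combining the named masks, as in the cell above, avoids the parentheses issue altogether.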


### What was the HP for GM vehicles?

In [5]:
# Your code here
hpVarBoolArray = data_var_names == "HP"
# print(hpVarBoolArray)
GMVarBoolArray = manufacturer_names == "GM"
# print(GMVarBoolArray)
# The selection is a one-element array; .item() extracts the scalar
GMHP = int(mpgData[GMVarBoolArray,hpVarBoolArray].item())

print(f'The horsepower of a GM is {GMHP}HP.')

The horsepower of a GM is 265HP.
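Indexing with a row mask and a column mask together returns a one-element array rather than a plain number. The sketch below, on a hand-typed subset of the table with placeholder names, pulls out the scalar with `.item()`. This matters on newer installations: NumPy 1.25 and later emit a `DeprecationWarning` when `int()` or `float()` is applied directly to an array that has one or more dimensions.

```python
import numpy as np

# Hand-typed subset of the quiz table (assumption: stands in for the pickle data)
cols = np.array(["Real-World Comb MPG", "HP"])
names = np.array(["GM", "Toyota"])
data = np.array([[22.9, 265.0],
                 [25.3, 216.0]])

# One True in each mask selects exactly one cell, returned as a 1-D array
selected = data[names == "GM", cols == "HP"]
print(selected.shape)     # (1,)

# .item() raises ValueError unless exactly one element was selected
gm_hp = selected.item()
print(gm_hp)              # 265.0
```

The `ValueError` is a useful safety net: a misspelled manufacturer or variable name produces an empty selection and fails loudly instead of returning a wrong value.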


### What were the real-world comb MPG and weight for VW?

In [6]:
# Your code here
MPGVarBoolArray = data_var_names == "Real-World Comb MPG"
WeightVarBoolArray = data_var_names == "Weight (lbs)"

VWVarBoolArray = manufacturer_names == "VW"

# .item() keeps the fractional part of the MPG value, which int() would truncate
VWMPG = mpgData[VWVarBoolArray,MPGVarBoolArray].item()
VWWeight = int(mpgData[VWVarBoolArray,WeightVarBoolArray].item())

print(f'The MPG of a VW is {VWMPG} and the weight is {VWWeight} lbs.')

The MPG of a VW is 26.5 and the weight is 3894 lbs.
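Several columns can also be selected in one step by building a single mask that is True for every wanted variable. A sketch using `np.isin` on a hand-typed subset of the table (the names `cols`, `names`, `data` and `wanted` are placeholders):

```python
import numpy as np

# Hand-typed subset of the quiz table (assumption: stands in for the pickle data)
cols = np.array(["Real-World Comb MPG", "Real-World Comb CO2 g/mi", "Weight (lbs)"])
names = np.array(["Ford", "VW"])
data = np.array([[22.9, 388.0, 4360.0],
                 [26.5, 335.0, 3894.0]])

# True wherever the column name is one of the wanted variables
wanted = np.isin(cols, ["Weight (lbs)", "Real-World Comb MPG"])

vw_values = data[names == "VW"][:, wanted]
print(cols[wanted])   # columns come back in table order, not request order
print(vw_values)
```

Note that a mask only records which columns to keep, so the result follows the order of the table even though weight was listed first in the request.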


### Create a Numpy array that has the real-world comb MPG, weight and 0-60 time (in that order) for Ford, VW and Honda (in that order)

In [7]:
# Your code here

FordVarBoolArray = manufacturer_names == "Ford"
HondaVarBoolArray = manufacturer_names == "Honda"
accVarBoolArray = data_var_names == "0-60 Time (s)"
# VWVarBoolArray, MPGVarBoolArray and WeightVarBoolArray are reused from In [6]

# Rows and columns in the requested order
rowBoolArrays = [FordVarBoolArray, VWVarBoolArray, HondaVarBoolArray]
colBoolArrays = [MPGVarBoolArray, WeightVarBoolArray, accVarBoolArray]

# .item() keeps the fractional part that int() would truncate
carArray = np.array([[mpgData[row, col].item() for col in colBoolArrays]
                     for row in rowBoolArrays])

carArray

array([[  22.9, 4360. ,    7.9],
       [  26.5, 3894. ,    7.9],
       [  29.4, 3595. ,    8.1]])
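A single boolean mask returns rows in table order, so it cannot by itself produce the Ford, VW, Honda ordering asked for here (Honda comes before VW in the table). One possible alternative, sketched on a hand-typed subset of the table with placeholder names, is to select one row or column per requested label and stack the pieces in the requested order:

```python
import numpy as np

# Hand-typed subset of the quiz table (assumption: stands in for the pickle data)
cols = np.array(["Real-World Comb MPG", "Weight (lbs)", "HP", "0-60 Time (s)"])
names = np.array(["Ford", "Honda", "VW"])
data = np.array([[22.9, 4360.0, 262.0, 7.9],
                 [29.4, 3595.0, 203.0, 8.1],
                 [26.5, 3894.0, 225.0, 7.9]])

row_order = ["Ford", "VW", "Honda"]
col_order = ["Real-World Comb MPG", "Weight (lbs)", "0-60 Time (s)"]

# One mask per requested manufacturer; vstack keeps the requested row order
rows = np.vstack([data[names == n] for n in row_order])

# One mask per requested variable; hstack keeps the requested column order
car_array = np.hstack([rows[:, cols == c] for c in col_order])
print(car_array)
```

This still uses only comparisons against the label arrays, never an absolute index, and it keeps the values as floats.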