## Practice Questions for NumPy
1. Define two custom NumPy arrays, say A and B. Generate two new NumPy arrays by stacking A and B vertically and horizontally.
2. Find the common elements between A and B. [Hint: intersection of two sets]
3. Extract all numbers from A that fall within a specific range, e.g. between 5 and 10. [Hint: np.where() or boolean masks might be useful]
4. Filter the rows of iris_2d that have petallength (3rd column) > 1.5 and sepallength (1st column) < 5.0.
```
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'
iris_2d = np.genfromtxt(url, delimiter=',', dtype='float', usecols=[0,1,2,3])
```

In [4]:
#1. Define two custom numpy arrays, say A and B. Generate two new numpy arrays by stacking A and B vertically and horizontally.
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 7]])
B = np.array([[7, 8, 9],
              [10, 11, 12]])

v_stacked = np.vstack((A, B))

h_stacked = np.hstack((A, B))

print("A:\n", A)
print("B:\n", B)
print("Vertically stacked:\n", v_stacked)
print("Horizontally stacked:\n", h_stacked)


A:
 [[1 2 3]
 [4 5 7]]
B:
 [[ 7  8  9]
 [10 11 12]]
Vertically stacked:
 [[ 1  2  3]
 [ 4  5  7]
 [ 7  8  9]
 [10 11 12]]
Horizontally stacked:
 [[ 1  2  3  7  8  9]
 [ 4  5  7 10 11 12]]
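
For 2D arrays, the stacking above can also be written with `np.concatenate` and an explicit axis. The sketch below reuses the same A and B and only checks that both spellings agree; it is an illustration, not a replacement for the solution.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 7]])
B = np.array([[7, 8, 9],
              [10, 11, 12]])

# For 2D inputs, vstack joins along axis 0 (rows) and hstack along axis 1 (columns).
v_concat = np.concatenate((A, B), axis=0)
h_concat = np.concatenate((A, B), axis=1)

print(v_concat.shape)  # (4, 3)
print(h_concat.shape)  # (2, 6)
print(np.array_equal(v_concat, np.vstack((A, B))))  # True
print(np.array_equal(h_concat, np.hstack((A, B))))  # True
```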


In [7]:
#2. Find common elements between A and B. [Hint : Intersection of two sets]
common = np.intersect1d(A, B)
print("Common elements:", common)

Common elements: [7]
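
`np.intersect1d` flattens its inputs and returns the sorted unique values they share. If the positions of the common elements are also of interest, one possible approach is `return_indices=True` combined with `np.unravel_index`, sketched here with the same A and B.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 7]])
B = np.array([[7, 8, 9],
              [10, 11, 12]])

# idx_a and idx_b index into the flattened versions of A and B.
common, idx_a, idx_b = np.intersect1d(A, B, return_indices=True)

# Convert the flat indices back to (row, column) positions.
pos_a = np.unravel_index(idx_a, A.shape)
pos_b = np.unravel_index(idx_b, B.shape)

print(common)  # [7]
print(pos_a)   # row 1, column 2 in A
print(pos_b)   # row 0, column 0 in B
```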


In [11]:
#3. Extract all numbers from A that fall within a specific range, e.g. between 5 and 10. [Hint: np.where() or boolean masks might be useful]
# This solution uses the inclusive range 3 to 8.
indices = np.where((A >= 3) & (A <= 8))
print("Indices:", indices)
print("Values:", A[indices])

Indices: (array([0, 1, 1, 1], dtype=int64), array([2, 0, 1, 2], dtype=int64))
Values: [3 4 5 7]
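
The hint also mentions boolean masks. A minimal sketch of that variant, using the same A and the same inclusive range of 3 to 8, indexes the array with the mask directly and skips `np.where`.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 7]])

# Each comparison yields a boolean array; parentheses are required
# because & binds more tightly than >= and <=.
mask = (A >= 3) & (A <= 8)

values = A[mask]    # 1D array of the selected elements, in row-major order
count = mask.sum()  # number of elements inside the range

print(values)  # [3 4 5 7]
print(count)   # 4
```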


In [13]:
#4. Filter the rows of iris_2d that have petallength (3rd column) > 1.5 and sepallength (1st column) < 5.0
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'
iris_2d = np.genfromtxt(url, delimiter=',', dtype='float', usecols=[0,1,2,3])

filtered_rows = iris_2d[(iris_2d[:, 2] > 1.5) & (iris_2d[:, 0] < 5.0)]

print("Filtered rows:\n", filtered_rows)

Filtered rows:
 [[4.8 3.4 1.6 0.2]
 [4.8 3.4 1.9 0.2]
 [4.7 3.2 1.6 0.2]
 [4.8 3.1 1.6 0.2]
 [4.9 2.4 3.3 1. ]
 [4.9 2.5 4.5 1.7]]
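
The same row filter can be tried without a download. The sketch below uses a small hand-made array (the values in `sample` are made up for illustration and are not taken from the iris file) and names each condition before combining them.

```python
import numpy as np

# Hypothetical rows: sepallength, sepalwidth, petallength, petalwidth
sample = np.array([[5.1, 3.5, 1.4, 0.2],
                   [4.8, 3.4, 1.6, 0.2],
                   [4.9, 2.4, 3.3, 1.0],
                   [6.3, 3.3, 6.0, 2.5]])

long_petal = sample[:, 2] > 1.5   # condition on the 3rd column
short_sepal = sample[:, 0] < 5.0  # condition on the 1st column

# A row is kept only if both conditions hold for it.
filtered = sample[long_petal & short_sepal]

print(filtered)
# [[4.8 3.4 1.6 0.2]
#  [4.9 2.4 3.3 1. ]]
```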


In [14]:
## Optional Practice Question

#Find the mean of a numeric column grouped by a categorical column in a 2D numpy array

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'
iris = np.genfromtxt(url, delimiter=',', dtype='object')
names = ('sepallength', 'sepalwidth', 'petallength', 'petalwidth', 'species')


numeric_column = iris[:, 1].astype('float')  # sepalwidth
grouping_column = iris[:, 4]  # species

output = []
# Compute the mean of numeric_column for each unique value in grouping_column

for species in np.unique(grouping_column):
    mean_value = numeric_column[grouping_column == species].mean()
    output.append((species.decode(), mean_value))

output

[('Iris-setosa', 3.418),
 ('Iris-versicolor', 2.7700000000000005),
 ('Iris-virginica', 2.974)]
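
A loop-free alternative for the grouped mean is to combine `np.unique(..., return_inverse=True)` with `np.bincount`. The sketch below runs on a small made-up dataset (the `species` and `widths` arrays are hypothetical) so that it works offline.

```python
import numpy as np

# Hypothetical categorical and numeric columns
species = np.array(['a', 'b', 'a', 'c', 'b', 'c'])
widths = np.array([3.0, 2.0, 4.0, 3.0, 3.0, 2.0])

# inverse[i] is the position of species[i] within the sorted unique labels.
labels, inverse = np.unique(species, return_inverse=True)

sums = np.bincount(inverse, weights=widths)  # per-group sum
counts = np.bincount(inverse)                # per-group size
means = sums / counts

print(list(zip(labels.tolist(), means.tolist())))
# [('a', 3.5), ('b', 2.5), ('c', 2.5)]
```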

## Practice Questions for Pandas

1. From df, select the 'Manufacturer', 'Model' and 'Type' columns for every 20th row, starting from the first row (row 0).

```
df = pd.read_csv('https://raw.githubusercontent.com/selva86/datasets/master/Cars93_miss.csv')
```

2. Replace missing values in Min.Price and Max.Price columns with their respective mean.

```
df = pd.read_csv('https://raw.githubusercontent.com/selva86/datasets/master/Cars93_miss.csv')
```

3. How to get the rows of a dataframe with row sum > 100?

```
df = pd.DataFrame(np.random.randint(10, 40, 60).reshape(-1, 4))
```

In [16]:
# 1. From df, select the 'Manufacturer', 'Model' and 'Type' columns for every 20th row, starting from the first row (row 0).
import pandas as pd

df = pd.read_csv('https://raw.githubusercontent.com/selva86/datasets/master/Cars93_miss.csv')

result = df.loc[::20, ['Manufacturer', 'Model', 'Type']]

print(result)


   Manufacturer    Model     Type
0         Acura  Integra    Small
20     Chrysler  LeBaron  Compact
40        Honda  Prelude   Sporty
60      Mercury   Cougar  Midsize
80       Subaru   Loyale    Small


In [18]:
#2. Replace missing values in Min.Price and Max.Price columns with their respective mean.
df = pd.read_csv('https://raw.githubusercontent.com/selva86/datasets/master/Cars93_miss.csv')

print(df[['Min.Price', 'Max.Price']].isna().sum())

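# Assign the result back instead of calling fillna(inplace=True) on a column selection,
# which can trigger a chained-assignment warning and may not update df in newer pandas versions.
df['Min.Price'] = df['Min.Price'].fillna(df['Min.Price'].mean())
df['Max.Price'] = df['Max.Price'].fillna(df['Max.Price'].mean())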

print(df[['Min.Price', 'Max.Price']].isna().sum())

Min.Price    7
Max.Price    5
dtype: int64
Min.Price    0
Max.Price    0
dtype: int64
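
Both columns can also be filled in one step, because `DataFrame.fillna` accepts a Series whose index matches the column names. The sketch below shows this on a hypothetical three-row frame (the values in `toy` are made up) rather than on the Cars93 data.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing value per price column
toy = pd.DataFrame({
    'Min.Price': [10.0, np.nan, 20.0],
    'Max.Price': [np.nan, 30.0, 50.0],
})

cols = ['Min.Price', 'Max.Price']

# toy[cols].mean() is a Series indexed by column name, so each column
# is filled with its own mean (NaN values are skipped when averaging).
toy[cols] = toy[cols].fillna(toy[cols].mean())

print(toy)
#    Min.Price  Max.Price
# 0       10.0       40.0
# 1       15.0       30.0
# 2       20.0       50.0
```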


In [23]:
#3. How to get the rows of a dataframe with row sum > 100?
df = pd.DataFrame(np.random.randint(10, 40, 60).reshape(-1, 4))
result = df[df.sum(axis=1) > 100]
print("Full DataFrame:\n", df)
print("Rows with row-sum > 100:\n", result)

Full DataFrame:
      0   1   2   3
0   31  15  11  13
1   33  19  23  31
2   30  27  32  19
3   35  21  16  19
4   14  32  27  26
5   18  29  11  18
6   34  30  17  15
7   26  28  15  34
8   32  16  12  30
9   22  26  39  25
10  18  21  11  33
11  21  20  39  31
12  13  16  30  25
13  12  22  20  36
14  14  10  14  30
Rows with row-sum > 100:
      0   1   2   3
1   33  19  23  31
2   30  27  32  19
7   26  28  15  34
9   22  26  39  25
11  21  20  39  31
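
Because the frame above is generated without a seed, every run prints different numbers. A minimal sketch of a reproducible variant, assuming NumPy's `default_rng` generator and an arbitrary seed of 42 (both are choices made here, not part of the exercise), could look like this:

```python
import numpy as np
import pandas as pd

# A fixed seed makes the random integers repeatable across runs.
rng = np.random.default_rng(42)

# 15 rows x 4 columns of integers in [10, 40), the same range as above
df_seeded = pd.DataFrame(rng.integers(10, 40, size=(15, 4)))

row_sums = df_seeded.sum(axis=1)  # one sum per row
high = df_seeded[row_sums > 100]  # keep rows whose sum exceeds 100

print(high)
```

Storing the row sums in their own variable also makes it easy to inspect them or reuse them for other thresholds.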
