# Recommender Systems

Let's begin with recommender systems: techniques that use data about users to recommend products and services to them.

Source: https://www.linkedin.com/learning/building-a-recommendation-system-with-python-machine-learning-ai/popularity-based-recommenders

#### Simple Approaches to Recommender Systems

# Popularity Based Recommenders
Popularity-based recommenders rely on count statistics. For example, if product A has received more ratings than product B (and more than some minimum threshold), we treat product A as the more popular one and recommend it first. This is a basic approach: it is not personalized, so every user gets the same recommendations.
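
As a minimal sketch of this idea, assume a small made-up ratings table (the rows and the `min_ratings` threshold below are illustrative and not taken from the course files). Places are ranked by how many ratings they received:

```python
import pandas as pd

# Hypothetical toy ratings; the column names mirror rating_final.csv
toy = pd.DataFrame({
    "userID":  ["U1", "U2", "U3", "U1", "U2", "U1"],
    "placeID": [10, 10, 10, 20, 20, 30],
    "rating":  [2, 1, 2, 0, 1, 2],
})

min_ratings = 2  # assumed popularity threshold

# Count ratings per place, most-rated first
counts = toy.groupby("placeID")["rating"].count().sort_values(ascending=False)

# Keep only the places that reach the threshold
popular = counts[counts >= min_ratings]
print(popular)
```

Note that the count ignores the rating values themselves: a place that many users rated poorly still ranks as popular.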

In [1]:
import pandas as pd 
import numpy as np

In [2]:
# Reading in our datasets 
frame = pd.read_csv('C:/Users/Darshil/Desktop/Dreams/RecSystems/Ex_Files_Intro_Python_Rec_Systems/Exercise Files/01_02/rating_final.csv')
cuisine = pd.read_csv('C:/Users/Darshil/Desktop/Dreams/RecSystems/Ex_Files_Intro_Python_Rec_Systems/Exercise Files/01_02/chefmozcuisine.csv')

#### Let's take a look at our datasets

##### frame
Features comprise
- user ID
- place ID
- overall rating, food rating and service rating
<br>
Basically, a user can rate multiple places overall and on food and service

##### cuisine
Features comprise
- place ID and cuisine name: which types of cuisine each place serves

In [3]:
"""Looking at our datasets"""
frame.head(5)

Unnamed: 0,userID,placeID,rating,food_rating,service_rating
0,U1077,135085,2,2,2
1,U1077,135038,2,2,1
2,U1077,132825,2,2,2
3,U1077,135060,1,2,2
4,U1068,135104,1,1,2


In [4]:
cuisine.head(5)

Unnamed: 0,placeID,Rcuisine
0,135110,Spanish
1,135109,Italian
2,135107,Latin_American
3,135106,Mexican
4,135105,Fast_Food


In [5]:
rating_count =  pd.DataFrame(frame.groupby('placeID')['rating'].count()).sort_values('rating', ascending=False)
rating_count.head()

Unnamed: 0_level_0,rating
placeID,Unnamed: 1_level_1
135085,36
132825,32
135032,28
135052,25
132834,25


In [12]:
most_rated_places = pd.DataFrame([135085,132825, 135032,135052, 132834], index = np.arange(5), columns = ['placeID'])

"""Here we merge to see which cuisines the most-rated places serve
- The idea is that the cuisines on offer might affect the overall rating of the place!
"""
summary = pd.merge(most_rated_places, cuisine, on = 'placeID')
summary

Unnamed: 0,placeID,Rcuisine
0,135085,Fast_Food
1,132825,Mexican
2,135032,Cafeteria
3,135032,Contemporary
4,135052,Bar
5,135052,Bar_Pub_Brewery
6,132834,Mexican


# Correlation-Based Recommendations
<br>
Now let's dive into making recommendations based on correlations. Before we do that, let's compute a few more groupby statistics.

In [10]:
# Reading in our datasets 
frame = pd.read_csv('C:/Users/Darshil/Desktop/Dreams/RecSystems/Ex_Files_Intro_Python_Rec_Systems/Exercise Files/01_02/rating_final.csv')
cuisine = pd.read_csv('C:/Users/Darshil/Desktop/Dreams/RecSystems/Ex_Files_Intro_Python_Rec_Systems/Exercise Files/01_02/chefmozcuisine.csv')
geodata = pd.read_csv('C:/Users/Darshil/Desktop/Dreams/RecSystems/Ex_Files_Intro_Python_Rec_Systems/Exercise Files/01_03/geoplaces2.csv')

In [15]:
places = geodata[['placeID', 'name']]
places.head()

Unnamed: 0,placeID,name
0,134999,Kiku Cuernavaca
1,132825,puesto de tacos
2,135106,El Rincón de San Francisco
3,132667,little pizza Emilio Portes Gil
4,132613,carnitas_mata


In [30]:
"""
Group By Metrics
- Group the ratings by placeID and compute the mean and the count
- We add the 'count' column so we know how many ratings each mean is based on
- We call describe to get summary statistics
"""
rating_mean = pd.DataFrame(frame.groupby('placeID')['rating'].mean())
rating_mean['count']= pd.DataFrame(frame.groupby('placeID')['rating'].count())
rating_mean.describe()

Unnamed: 0,rating,count
count,130.0,130.0
mean,1.179622,8.930769
std,0.349354,6.124279
min,0.25,3.0
25%,1.0,5.0
50%,1.181818,7.0
75%,1.4,11.0
max,2.0,36.0


In [28]:
# Lets take a look at our new df
rating_mean.sort_values('count', ascending=False).head()

# Lets get the name 
places[places['placeID'] == 135085]

Unnamed: 0,placeID,name
121,135085,Tortas Locas Hipocampo


###  Creating a pivot table
<br>
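Here we build a user-by-place matrix: each row is a userID, each column is a placeID, and each cell holds the rating that user gave that place.
<br>
Ex: the row for user U1002 has a 1.0 in the column for placeID 135085; a NaN means the user has not rated that place.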
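
A tiny sketch of the same reshaping, using made-up ratings (the IDs below are hypothetical), shows where the NaN cells come from:

```python
import pandas as pd

# Hypothetical long-format ratings: one row per (user, place) pair
toy = pd.DataFrame({
    "userID":  ["U1", "U1", "U2", "U3"],
    "placeID": [10, 20, 10, 20],
    "rating":  [2, 1, 0, 2],
})

# Rows become users, columns become places, cells hold the rating
crosstab = pd.pivot_table(toy, values="rating", index="userID", columns="placeID")
print(crosstab)
```

U2 never rated place 20 and U3 never rated place 10, so those two cells are NaN.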

In [58]:
places_crosstab = pd.pivot_table(data = frame, values='rating', index = 'userID', columns = 'placeID')
places_crosstab.head()

placeID,132560,132561,132564,132572,132583,132584,132594,132608,132609,132613,...,135080,135081,135082,135085,135086,135088,135104,135106,135108,135109
userID,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
U1001,,,,,,,,,,,...,,,,0.0,,,,,,
U1002,,,,,,,,,,,...,,,,1.0,,,,1.0,,
U1003,,,,,,,,,,,...,2.0,,,,,,,,,
U1004,,,,,,,,,,,...,,,,,,,,2.0,,
U1005,,,,,,,,,,,...,,,,,,,,,,


In [69]:
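# Select the column for place 135085: every user's rating of that place
tortas_ratings = places_crosstab[135085]

# Users who actually rated it (NaN means no rating).
# Note: the filtered result is not assigned, so tortas_ratings itself stays unfiltered
tortas_ratings[tortas_ratings>=0]
tortas_ratings.head()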

userID
U1001    0.0
U1002    1.0
U1003    NaN
U1004    NaN
U1005    NaN
Name: 135085, dtype: float64

###  Evaluating similarity based on correlation
<br>
- We pick a place with ratings from different users
- Then we correlate that place with all other places
- Based on the R values, we learn how correlated user ratings are for different places
- Therefore: we can recommend other places with high R values
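
The steps above can be sketched on a made-up user-by-place matrix. The place labels `A` to `D` and the `overlap >= 3` cut-off are assumptions for illustration. The sketch filters on the number of users who rated both places, which is a stricter check than the total rating count used in the notebook:

```python
import numpy as np
import pandas as pd

# Hypothetical user-by-place ratings; NaN means "not rated"
ratings = pd.DataFrame(
    {
        "A": [2, 1, 0, 2, np.nan],
        "B": [2, 1, 0, np.nan, 1],
        "C": [0, 1, 2, 0, np.nan],
        "D": [np.nan, np.nan, 1, 2, np.nan],
    },
    index=["U1", "U2", "U3", "U4", "U5"],
)

target = ratings["A"]

# Pearson correlation of every place with the target place
pearson = ratings.corrwith(target)

# Number of users who rated both the target and each other place
overlap = ratings[target.notna()].notna().sum()

summary = pd.DataFrame({"PearsonR": pearson, "overlap": overlap})

# Drop the target itself and any correlation backed by fewer than 3 shared users
trusted = (
    summary[summary["overlap"] >= 3]
    .drop(index="A")
    .sort_values("PearsonR", ascending=False)
)
print(trusted)
```

Place `D` correlates perfectly with `A`, but only two users rated both, so it is filtered out. Correlations built on very few shared ratings are easy to over-trust.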

In [74]:
"""
In this example, we see how strongly each place's ratings correlate with those of Tortas
"""
similar_to_tortas = places_crosstab.corrwith(tortas_ratings)
corr_tortas = pd.DataFrame(similar_to_tortas, columns = ['PearsonR'])
corr_tortas.dropna(inplace=True)
corr_tortas.head()



Unnamed: 0_level_0,PearsonR
placeID,Unnamed: 1_level_1
132572,-0.428571
132723,0.301511
132754,0.930261
132825,0.700745
132834,0.814823


In [81]:
# We join the count column and keep only the places with more than 10 ratings
tortas_corr_summary = corr_tortas.join(rating_mean['count'])
tortas_corr_summary[tortas_corr_summary['count']>10].sort_values('PearsonR', ascending=False).head(10)

Unnamed: 0_level_0,PearsonR,count
placeID,Unnamed: 1_level_1,Unnamed: 2_level_1
135085,1.0,36
135076,1.0,13
135066,1.0,12
132754,0.930261,13
135045,0.912871,13
135062,0.898933,21
135028,0.892218,15
135042,0.881409,20
135046,0.867722,11
132872,0.840168,12


In [91]:
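# Based on the table above, we pick the top correlated places and look up their cuisines;
# places with the same cuisine type as Tortas are the strongest candidates.
# Note: 135754 does not appear in the table above (132754 does), and the inner merge
# drops any placeID that has no row in `cuisine`.
places_corr_tortas = pd.DataFrame([135085,135754,135045,135062,135028,135042,135046],index=np.arange(7),columns=['placeID'])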
summary = pd.merge(places_corr_tortas,cuisine, on = 'placeID')
summary

Unnamed: 0,placeID,Rcuisine
0,135085,Fast_Food
1,135028,Mexican
2,135042,Chinese
3,135046,Fast_Food


In [87]:
# As you can see this is the other fast food place that can be recommended!
places[places['placeID']== 135046]

# Statistics
cuisine['Rcuisine'].describe()

Unnamed: 0,placeID,name
42,135046,Restaurante El Reyecito


# Machine Learning Based Recommendation Systems

## Classification based collaborative filtering

In [334]:
import numpy as np
import pandas as pd

from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report

In [357]:
# Reading our data in. Note that the categorical columns were converted to dummy (0/1) variables using pandas
# Everything from the "y binary" column to the "divorced" column is a dummy variable derived from categorical data
bank_full = pd.read_csv('C:/Users/Darshil/Desktop/Dreams/RecSystems/Ex_Files_Intro_Python_Rec_Systems/Exercise Files/02_01/bank_full_w_dummy_vars.csv')
bank_full.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 45211 entries, 0 to 45210
Data columns (total 37 columns):
age                             45211 non-null int64
job                             45211 non-null object
marital                         45211 non-null object
education                       45211 non-null object
default                         45211 non-null object
balance                         45211 non-null int64
housing                         45211 non-null object
loan                            45211 non-null object
contact                         45211 non-null object
day                             45211 non-null int64
month                           45211 non-null object
duration                        45211 non-null int64
campaign                        45211 non-null int64
pdays                           45211 non-null int64
previous                        45211 non-null int64
poutcome                        45211 non-null object
y                               45

##### Using scikit-learn

In [348]:
# Let's get our X, y data from our dataset: X is the 19 dummy columns, y is the binary target in column 17

X = bank_full.iloc[:,18:37].values
y = bank_full.iloc[:,17].values

In [353]:
# Fit a logistic regression with scikit-learn
LogReg = LogisticRegression()
LogReg.fit(X,y)

# Predict for a single new user, encoded as one row of 19 dummy values
new_user = np.array([0,0,1,0,1,1,0,0,0,0,0,0,0,0,0,0,0,1,1]).reshape(1,-1)
Y_PRED_test = LogReg.predict(new_user)
Y_PRED_test

array([1], dtype=int64)

In [356]:
# Evaluating in scikit-learn: predict, then print the report.
# Note: we predict on the same data we trained on, so these scores are optimistic
Y_PRED = LogReg.predict(X)
print (classification_report(y,Y_PRED))


             precision    recall  f1-score   support

          0       0.90      0.99      0.94     39922
          1       0.67      0.17      0.27      5289

avg / total       0.87      0.89      0.86     45211
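
The report above is computed on the training data. A minimal sketch of a held-out evaluation follows, using synthetic binary features rather than the bank dataset. The data generator, the 25% test size and the variable names are all assumptions for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the dummy-variable matrix: 500 users, 5 binary features
rng = np.random.default_rng(0)
X_demo = rng.integers(0, 2, size=(500, 5))

# The label depends on the first two features, plus random noise
logits = 2.0 * X_demo[:, 0] - 1.5 * X_demo[:, 1] - 0.5
y_demo = (rng.random(500) < 1 / (1 + np.exp(-logits))).astype(int)

# Hold out 25% of the rows; stratify keeps the class balance similar in both parts
X_train, X_test, y_train, y_test = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=0, stratify=y_demo
)

clf = LogisticRegression().fit(X_train, y_train)

# Scores on rows the model has never seen
report = classification_report(y_test, clf.predict(X_test))
print(report)
```

With an imbalanced target like the one above (39922 versus 5289 rows), the per-class recall is usually more informative than the overall accuracy.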



##### Using Keras

In [198]:
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import RMSprop

# One hidden layer over the 19 dummy features; two softmax outputs (class 0 / class 1)
model = Sequential()
model.add(Dense(512, activation = 'relu', input_shape=(19,)))
model.add(Dense(2, activation = 'softmax'))

# Keep a reference to the configured optimizer so that compile() actually uses it
optimizer = RMSprop(lr=0.001, rho=0.9, epsilon=None, decay=0.0)

model.compile(loss = 'sparse_categorical_crossentropy', 
              optimizer = optimizer,
              metrics  = ['accuracy'])

history  = model.fit(X, y, 
                     batch_size=30,
                     epochs=30,
                     verbose=2)

Epoch 1/30
 - 3s - loss: 0.3271 - acc: 0.8920
Epoch 2/30
 - 2s - loss: 0.3245 - acc: 0.8927
Epoch 3/30
 - 2s - loss: 0.3233 - acc: 0.8928
Epoch 4/30
 - 2s - loss: 0.3232 - acc: 0.8925
Epoch 5/30
 - 2s - loss: 0.3226 - acc: 0.8929
Epoch 6/30
 - 2s - loss: 0.3225 - acc: 0.8927
Epoch 7/30
 - 2s - loss: 0.3224 - acc: 0.8925
Epoch 8/30
 - 2s - loss: 0.3222 - acc: 0.8925
Epoch 9/30
 - 2s - loss: 0.3222 - acc: 0.8923
Epoch 10/30
 - 2s - loss: 0.3224 - acc: 0.8926
Epoch 11/30
 - 2s - loss: 0.3218 - acc: 0.8924
Epoch 12/30
 - 2s - loss: 0.3219 - acc: 0.8930
Epoch 13/30
 - 2s - loss: 0.3221 - acc: 0.8926
Epoch 14/30
 - 2s - loss: 0.3218 - acc: 0.8925
Epoch 15/30
 - 2s - loss: 0.3217 - acc: 0.8926
Epoch 16/30
 - 2s - loss: 0.3216 - acc: 0.8930
Epoch 17/30
 - 3s - loss: 0.3212 - acc: 0.8927
Epoch 18/30
 - 3s - loss: 0.3217 - acc: 0.8928
Epoch 19/30
 - 3s - loss: 0.3213 - acc: 0.8931
Epoch 20/30
 - 3s - loss: 0.3217 - acc: 0.8930
Epoch 21/30
 - 2s - loss: 0.3215 - acc: 0.8929
Epoch 22/30
 - 2s - lo

In [194]:
new_user = np.array([1,0,1,0,1,1,0,1,0,0,0,0,0,0,0,0,0,1,1]).reshape(1,-1)
model.predict(new_user).argmax()

0

## Model based collaborative filtering
### SVD Matrix Factorization

In [199]:
import pandas as pd
import numpy as np
import sklearn
from sklearn.decomposition import TruncatedSVD

In [206]:
# Importing data on users and their ratings on movies
frame = pd.read_csv('C:/Users/Darshil/Desktop/Dreams/RecSystems/Ex_Files_Intro_Python_Rec_Systems/Exercise Files/02_02/ml-100k/u.data', sep = '\t', names = ['user_id', 'item_id', 'rating', 'timestamp'])
frame.head()

Unnamed: 0,user_id,item_id,rating,timestamp
0,196,242,3,881250949
1,186,302,3,891717742
2,22,377,1,878887116
3,244,51,2,880606923
4,166,346,1,886397596


In [208]:
movies = pd.read_csv('C:/Users/Darshil/Desktop/Dreams/RecSystems/Ex_Files_Intro_Python_Rec_Systems/Exercise Files/02_02/ml-100k/u.item', sep = '|', names = ['item_id','movie_title', 'release_data', 
'video_release_data','IMdb URL', 'unknown', 'Action', 'Adventure', 'Animation', 'Childrens', 'Comedy', 'Crime','Documentary', 
'Drama', 'Fantasy', 'Film-Noir', 'Horror', 'Musical', 'Mystery', 'Romance', 'Sci-Fi','Thriller', 'War', 'Western' ])

In [209]:
movies.head()

Unnamed: 0,item_id,movie_title,release_data,video_release_data,IMdb URL,unknown,Action,Adventure,Animation,Childrens,...,Fantasy,Film-Noir,Horror,Musical,Mystery,Romance,Sci-Fi,Thriller,War,Western
0,1,Toy Story (1995),01-Jan-1995,,http://us.imdb.com/M/title-exact?Toy%20Story%2...,0,0,0,1,1,...,0,0,0,0,0,0,0,0,0,0
1,2,GoldenEye (1995),01-Jan-1995,,http://us.imdb.com/M/title-exact?GoldenEye%20(...,0,1,1,0,0,...,0,0,0,0,0,0,0,1,0,0
2,3,Four Rooms (1995),01-Jan-1995,,http://us.imdb.com/M/title-exact?Four%20Rooms%...,0,0,0,0,0,...,0,0,0,0,0,0,0,1,0,0
3,4,Get Shorty (1995),01-Jan-1995,,http://us.imdb.com/M/title-exact?Get%20Shorty%...,0,1,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,5,Copycat (1995),01-Jan-1995,,http://us.imdb.com/M/title-exact?Copycat%20(1995),0,0,0,0,0,...,0,0,0,0,0,0,0,1,0,0


In [212]:
combined_movies_data = pd.merge(frame, movies, on='item_id')
combined_movies_data

Unnamed: 0,user_id,item_id,rating,timestamp,movie_title,release_data,video_release_data,IMdb URL,unknown,Action,...,Fantasy,Film-Noir,Horror,Musical,Mystery,Romance,Sci-Fi,Thriller,War,Western
0,196,242,3,881250949,Kolya (1996),24-Jan-1997,,http://us.imdb.com/M/title-exact?Kolya%20(1996),0,0,...,0,0,0,0,0,0,0,0,0,0
1,63,242,3,875747190,Kolya (1996),24-Jan-1997,,http://us.imdb.com/M/title-exact?Kolya%20(1996),0,0,...,0,0,0,0,0,0,0,0,0,0
2,226,242,5,883888671,Kolya (1996),24-Jan-1997,,http://us.imdb.com/M/title-exact?Kolya%20(1996),0,0,...,0,0,0,0,0,0,0,0,0,0
3,154,242,3,879138235,Kolya (1996),24-Jan-1997,,http://us.imdb.com/M/title-exact?Kolya%20(1996),0,0,...,0,0,0,0,0,0,0,0,0,0
4,306,242,5,876503793,Kolya (1996),24-Jan-1997,,http://us.imdb.com/M/title-exact?Kolya%20(1996),0,0,...,0,0,0,0,0,0,0,0,0,0
5,296,242,4,884196057,Kolya (1996),24-Jan-1997,,http://us.imdb.com/M/title-exact?Kolya%20(1996),0,0,...,0,0,0,0,0,0,0,0,0,0
6,34,242,5,888601628,Kolya (1996),24-Jan-1997,,http://us.imdb.com/M/title-exact?Kolya%20(1996),0,0,...,0,0,0,0,0,0,0,0,0,0
7,271,242,4,885844495,Kolya (1996),24-Jan-1997,,http://us.imdb.com/M/title-exact?Kolya%20(1996),0,0,...,0,0,0,0,0,0,0,0,0,0
8,201,242,4,884110598,Kolya (1996),24-Jan-1997,,http://us.imdb.com/M/title-exact?Kolya%20(1996),0,0,...,0,0,0,0,0,0,0,0,0,0
9,209,242,4,883589606,Kolya (1996),24-Jan-1997,,http://us.imdb.com/M/title-exact?Kolya%20(1996),0,0,...,0,0,0,0,0,0,0,0,0,0


In [220]:
# We use groupby metrics here
combined_movies_data.groupby('item_id')['rating'].count().sort_values(ascending=False).head()

# Here we simply want to use itemID = 50 since that is the one that got the most ratings (count)
combined_movies_data[combined_movies_data['item_id']==50]['movie_title'].unique()

array(['Star Wars (1977)'], dtype=object)

In [285]:
"""
Building a utility matrix
"""
# places_crosstab = pd.pivot_table(data = frame, values='rating', index = 'userID', columns = 'placeID')

rating_crosstab = pd.pivot_table(data = combined_movies_data,values='rating', index = 'user_id', columns = 'movie_title', fill_value=0)
rating_crosstab.head()

movie_title,'Til There Was You (1997),1-900 (1994),101 Dalmatians (1996),12 Angry Men (1957),187 (1997),2 Days in the Valley (1996),"20,000 Leagues Under the Sea (1954)",2001: A Space Odyssey (1968),3 Ninjas: High Noon At Mega Mountain (1998),"39 Steps, The (1935)",...,Yankee Zulu (1994),Year of the Horse (1997),You So Crazy (1994),Young Frankenstein (1974),Young Guns (1988),Young Guns II (1990),"Young Poisoner's Handbook, The (1995)",Zeus and Roxanne (1997),unknown,Á köldum klaka (Cold Fever) (1994)
user_id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
1,0,0,2,5,0,0,3,4,0,0,...,0,0,0,5,3,0,0,0,4,0
2,0,0,0,0,0,0,0,0,1,0,...,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,2,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
5,0,0,2,0,0,0,0,4,0,0,...,0,0,0,4,0,0,0,0,4,0


In [242]:
# Now we transpose the matrix using .T. So now the rows are movie titles (1664) and columns are users 
X = rating_crosstab.values.T
X.shape

(1664, 943)

In [268]:
# Then we decompose it
SVD = TruncatedSVD(n_components=12, random_state=17)
resultant_matrix = SVD.fit_transform(X)
resultant_matrix.shape

(1664, 12)

In [269]:
# Now we generate a correlation matrix
corr_mat = np.corrcoef(resultant_matrix)
corr_mat.shape

(1664, 1664)
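
The two steps above (reduce, then correlate) can be sketched on a toy item-by-user matrix. This sketch swaps in `np.linalg.svd` for `TruncatedSVD` so that it depends only on NumPy. Keeping the first `k` left singular vectors scaled by their singular values gives the same kind of item embedding, up to sign. The ratings below are made up:

```python
import numpy as np

# Hypothetical item-by-user ratings: items 0 and 1 are liked by the same users,
# items 2 and 3 by a different group (0 means "not rated", as with fill_value=0)
X_toy = np.array([
    [5, 4, 0, 0, 1],
    [4, 5, 0, 1, 0],
    [0, 0, 5, 4, 0],
    [0, 1, 4, 5, 0],
], dtype=float)

# Keep k latent components. k must be at least 3 here: with only two values per
# item, every Pearson correlation would be exactly +1 or -1
k = 3
U, s, Vt = np.linalg.svd(X_toy, full_matrices=False)
item_factors = U[:, :k] * s[:k]          # one k-dimensional vector per item

# Item-by-item Pearson correlation in the latent space
corr = np.corrcoef(item_factors)
print(corr.round(2))
```

Items with similar rating patterns should end up with higher correlations than items rated by different groups of users.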

In [277]:
# Now we isolate star wars
movies_names = rating_crosstab.columns
movies_list = list(movies_names)

# finding numeric index value of star wars
star_wars=movies_list.index('Star Wars (1977)')
star_wars

1398

In [295]:
# pulling star wars data
corr_star_wars = corr_mat[star_wars]

In [296]:
# Now we recommend movies: titles whose correlation with Star Wars is above 0.9,
# excluding Star Wars itself (correlation 1.0)
list(movies_names[(corr_star_wars < 1.0) & (corr_star_wars > 0.90)])
corr_mat.shape

(1664, 1664)
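
Instead of a fixed 0.9 cut-off, one option is to return the top `n` titles by correlation. The helper `top_similar` below is a hypothetical sketch, and the titles other than Star Wars and all correlation values are placeholders:

```python
import numpy as np

def top_similar(corr_row, names, target_idx, n=3):
    """Return the n (name, correlation) pairs most correlated with the target."""
    # Treat undefined correlations as the lowest possible value
    scores = np.nan_to_num(np.asarray(corr_row, dtype=float), nan=-1.0)
    # Indices sorted from highest to lowest correlation
    order = np.argsort(scores)[::-1]
    # Skip the target itself, which always correlates perfectly with itself
    order = [i for i in order if i != target_idx]
    return [(names[i], float(scores[i])) for i in order[:n]]

# Placeholder titles and correlations, standing in for movies_names and corr_star_wars
names = ["Star Wars (1977)", "Movie B", "Movie C", "Movie D"]
corr_row = np.array([1.0, 0.95, 0.20, 0.91])

recs = top_similar(corr_row, names, target_idx=0, n=2)
print(recs)
```

In the notebook this would be called with `corr_star_wars`, `movies_list` and `star_wars` as the three arguments.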

## Content-based filtering
### Nearest Neighbours

In [1]:
import pandas as pd
import numpy as np

import sklearn
from sklearn.neighbors import NearestNeighbors

In [31]:
# reading our data
cars = pd.read_csv('C:/Users/Darshil/Desktop/Dreams/RecSystems/Ex_Files_Intro_Python_Rec_Systems/Exercise Files/02_03/mtcars.csv')
cars.columns = ['car_names', 'mpg', 'cyl', 'disp', 'hp', 'drat', 'weight', 'qsec', 'vs','am', 'gear', 'carb']

cars.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 32 entries, 0 to 31
Data columns (total 12 columns):
car_names    32 non-null object
mpg          32 non-null float64
cyl          32 non-null int64
disp         32 non-null float64
hp           32 non-null int64
drat         32 non-null float64
weight       32 non-null float64
qsec         32 non-null float64
vs           32 non-null int64
am           32 non-null int64
gear         32 non-null int64
carb         32 non-null int64
dtypes: float64(5), int64(6), object(1)
memory usage: 3.1+ KB


In [58]:
# creating our test point
t = [15,300,160,3.2]

# Creating our dataset. We pick: mpg, engine displacement, horsepower, weight - we don't use all our columns
X = cars.iloc[:,[1,3,4,6]].values
X[0:5]

array([[ 21.   , 160.   , 110.   ,   2.62 ],
       [ 21.   , 160.   , 110.   ,   2.875],
       [ 22.8  , 108.   ,  93.   ,   2.32 ],
       [ 21.4  , 258.   , 110.   ,   3.215],
       [ 18.7  , 360.   , 175.   ,   3.44 ]])

In [59]:
# Now we fit the nearest neighbours model
nbrs = NearestNeighbors(n_neighbors=3).fit(X)

# Now we test it out. Note it returns a tuple: (distances, index positions)
pred= nbrs.kneighbors([t])
list(pred[1][0])

[22, 21, 13]

In [60]:
index_pred = pred[1]
index_pred[0]

array([22, 21, 13], dtype=int64)
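
The features above are on very different scales, and Euclidean distance is sensitive to that. The sketch below is an optional variation that the notebook does not perform. It uses only the five rows printed earlier and the same test point, and compares how much each feature contributes to the squared distance before and after standardising:

```python
import numpy as np

# The first five rows of X shown above: mpg, disp, hp, weight
X5 = np.array([
    [21.0, 160.0, 110.0, 2.62],
    [21.0, 160.0, 110.0, 2.875],
    [22.8, 108.0,  93.0, 2.32],
    [21.4, 258.0, 110.0, 3.215],
    [18.7, 360.0, 175.0, 3.44],
])
t = np.array([15, 300, 160, 3.2])   # the test point used above

# Raw squared differences between the test point and row 4
raw_sq = (X5[4] - t) ** 2
raw_share = raw_sq / raw_sq.sum()

# Standardise each column (zero mean, unit variance) using these five rows only
mean, std = X5.mean(axis=0), X5.std(axis=0)
scaled_sq = ((X5[4] - mean) / std - (t - mean) / std) ** 2
scaled_share = scaled_sq / scaled_sq.sum()

print(raw_share.round(3))      # share of the distance due to each feature, raw
print(scaled_share.round(3))   # the same shares after standardising
```

On the raw values, displacement accounts for most of the distance simply because its numbers are the largest. After scaling, each feature is measured in standard deviations, so the balance between features changes.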

## Saving our KNN Model to deploy in Flask

<br> Let's also load it back up and test it out

In [61]:
# sklearn.externals.joblib has been removed from recent scikit-learn releases; import joblib directly
import joblib
joblib.dump(nbrs,'nearest_n_3')

['nearest_n_3']

In [63]:
nj = joblib.load('C:/Users/Darshil/gitly/Deep-Learning/My Projects/Flask_Keras/saved_models/nearest_n_3')

In [64]:
nj.kneighbors([[15,300,160,3.2]])

(array([[10.77474942, 20.59981553, 31.40089808]]),
 array([[22, 21, 13]], dtype=int64))
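
A self-contained sketch of the same save-and-reload round trip, assuming the standalone `joblib` package is installed. The three-point dataset, the file name `nearest_demo.joblib` and the temporary directory are all made up for illustration:

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Hypothetical tiny dataset with two features
X_demo = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
model = NearestNeighbors(n_neighbors=2).fit(X_demo)

# Save to a temporary directory and load the model back
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "nearest_demo.joblib")
    joblib.dump(model, path)
    restored = joblib.load(path)

# The restored model should answer queries exactly like the original
query = [[0.9, 0.9]]
dist_a, idx_a = model.kneighbors(query)
dist_b, idx_b = restored.kneighbors(query)
print(idx_b)
```

A model saved this way is generally only safe to load with a compatible scikit-learn version, so it is worth recording the version alongside the file.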

In [65]:
# prediction
pred = nj.kneighbors([[15,300,160,3.2]])
# getting the index of the most similar item (the nearest neighbour)
index_pred = int(pred[1][0][0])

# getting the similar item from the dataset by that index
similar = cars.iloc[[index_pred]]

# iterating here to show the recommendations as a list in Flask
for i,x in similar.iterrows():
    print (x['mpg'])

15.2


In [69]:
cars.iloc[list(pred[1][0])]

Unnamed: 0,car_names,mpg,cyl,disp,hp,drat,weight,qsec,vs,am,gear,carb
22,AMC Javelin,15.2,8,304.0,150,3.15,3.435,17.3,0,0,3,2
21,Dodge Challenger,15.5,8,318.0,150,2.76,3.52,16.87,0,0,3,2
13,Merc 450SLC,15.2,8,275.8,180,3.07,3.78,18.0,0,0,3,3


In [39]:
#selecting rows of a dataframe
cars.iloc[[0,4]]

Unnamed: 0,car_names,mpg,cyl,disp,hp,drat,weight,qsec,vs,am,gear,carb
0,Mazda RX4,21.0,6,160.0,110,3.9,2.62,16.46,0,1,4,4
4,Hornet Sportabout,18.7,8,360.0,175,3.15,3.44,17.02,0,0,3,2
