In [3]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


## Hospital General Information Dataset

### By: Anurag Bolneni & Ridima Bhatt

This notebook has our preliminary hospital reccomendation system MVP based on the Hospital General Information dataset from CMS. It is split into two sections:
+ Step 1: Data Cleaning & Manipulation
+ Step 2: Taking User Input
+ Step 3: Reccomendation System MVP

## Step 1: Data Cleaning & Manipulation

We first import necessary libraries and use an API to pull data from CMS websites. Then, data is fed into a pandas dataframe where we clean for parameters of interest and data types of different columns.

In [1]:
import pandas as pd
from collections import Counter
import numpy as np

In [6]:
pd.read_csv('/content/drive/MyDrive/siads697/data/Hospital_General_Information.csv').head()

Unnamed: 0,Facility ID,Facility Name,Address,City,State,ZIP Code,County Name,Phone Number,Hospital Type,Hospital Ownership,...,Count of READM Measures Better,Count of READM Measures No Different,Count of READM Measures Worse,READM Group Footnote,Pt Exp Group Measure Count,Count of Facility Pt Exp Measures,Pt Exp Group Footnote,TE Group Measure Count,Count of Facility TE Measures,TE Group Footnote
0,10001,SOUTHEAST HEALTH MEDICAL CENTER,1108 ROSS CLARK CIRCLE,DOTHAN,AL,36301,HOUSTON,(334) 793-8701,Acute Care Hospitals,Government - Hospital District or Authority,...,1,9,1,,8,8,,14,11,
1,10005,MARSHALL MEDICAL CENTERS,2505 U S HIGHWAY 431 NORTH,BOAZ,AL,35957,MARSHALL,(256) 593-8310,Acute Care Hospitals,Government - Hospital District or Authority,...,0,9,1,,8,8,,14,14,
2,10006,NORTH ALABAMA MEDICAL CENTER,1701 VETERANS DRIVE,FLORENCE,AL,35630,LAUDERDALE,(256) 768-8400,Acute Care Hospitals,Proprietary,...,1,7,1,,8,8,,14,11,
3,10007,MIZELL MEMORIAL HOSPITAL,702 N MAIN ST,OPP,AL,36467,COVINGTON,(334) 493-3541,Acute Care Hospitals,Voluntary non-profit - Private,...,0,6,0,,8,8,,14,7,
4,10008,CRENSHAW COMMUNITY HOSPITAL,101 HOSPITAL CIRCLE,LUVERNE,AL,36049,CRENSHAW,(334) 335-3374,Acute Care Hospitals,Proprietary,...,0,4,0,,8,Not Available,5.0,14,8,


In [7]:
df_hosp_gen_info = pd.read_csv('/content/drive/MyDrive/siads697/data/Hospital_General_Information.csv').iloc[:,:13].drop(columns=['Phone Number','Meets criteria for promoting interoperability of EHRs'])
df_hosp_gen_info.head()

Unnamed: 0,Facility ID,Facility Name,Address,City,State,ZIP Code,County Name,Hospital Type,Hospital Ownership,Emergency Services,Hospital overall rating
0,10001,SOUTHEAST HEALTH MEDICAL CENTER,1108 ROSS CLARK CIRCLE,DOTHAN,AL,36301,HOUSTON,Acute Care Hospitals,Government - Hospital District or Authority,Yes,3
1,10005,MARSHALL MEDICAL CENTERS,2505 U S HIGHWAY 431 NORTH,BOAZ,AL,35957,MARSHALL,Acute Care Hospitals,Government - Hospital District or Authority,Yes,2
2,10006,NORTH ALABAMA MEDICAL CENTER,1701 VETERANS DRIVE,FLORENCE,AL,35630,LAUDERDALE,Acute Care Hospitals,Proprietary,Yes,2
3,10007,MIZELL MEMORIAL HOSPITAL,702 N MAIN ST,OPP,AL,36467,COVINGTON,Acute Care Hospitals,Voluntary non-profit - Private,Yes,2
4,10008,CRENSHAW COMMUNITY HOSPITAL,101 HOSPITAL CIRCLE,LUVERNE,AL,36049,CRENSHAW,Acute Care Hospitals,Proprietary,Yes,2


In [8]:
print('Length of unique Facility ID vs total:',len(df_hosp_gen_info['Facility ID'].unique()),',',len(df_hosp_gen_info['Facility ID']))
print('Length of unique address vs total:',len(df_hosp_gen_info.Address.unique()),',',len(df_hosp_gen_info.Address))
print('Variations in hospital overall ratings:', Counter(df_hosp_gen_info['Hospital overall rating']))
print('Emerg_services:', Counter(df_hosp_gen_info['Emergency Services']))
df_hosp_gen_info.dtypes

Length of unique Facility ID vs total: 5306 , 5306
Length of unique address vs total: 5276 , 5306
Variations in hospital overall ratings: Counter({'Not Available': 1996, '3': 1006, '4': 979, '2': 682, '5': 452, '1': 191})
Emerg_services: Counter({'Yes': 4455, 'No': 851})


Facility ID                object
Facility Name              object
Address                    object
City                       object
State                      object
ZIP Code                    int64
County Name                object
Hospital Type              object
Hospital Ownership         object
Emergency Services         object
Hospital overall rating    object
dtype: object

-- Anurag to write blurb about above findings later

We'll be taking the cosine similarity of State and Emergency Services to test our MVP of the reccomendation algorithm. As a last step of cleaning, we need to change Emergency Services (Y/N) to binary classifications.

In [9]:
df_hosp_gen_info['Emergency Services'] = [1 if x=='Yes' else 0 for x in df_hosp_gen_info['Emergency Services']]
df_hosp_gen_info.head()

Unnamed: 0,Facility ID,Facility Name,Address,City,State,ZIP Code,County Name,Hospital Type,Hospital Ownership,Emergency Services,Hospital overall rating
0,10001,SOUTHEAST HEALTH MEDICAL CENTER,1108 ROSS CLARK CIRCLE,DOTHAN,AL,36301,HOUSTON,Acute Care Hospitals,Government - Hospital District or Authority,1,3
1,10005,MARSHALL MEDICAL CENTERS,2505 U S HIGHWAY 431 NORTH,BOAZ,AL,35957,MARSHALL,Acute Care Hospitals,Government - Hospital District or Authority,1,2
2,10006,NORTH ALABAMA MEDICAL CENTER,1701 VETERANS DRIVE,FLORENCE,AL,35630,LAUDERDALE,Acute Care Hospitals,Proprietary,1,2
3,10007,MIZELL MEMORIAL HOSPITAL,702 N MAIN ST,OPP,AL,36467,COVINGTON,Acute Care Hospitals,Voluntary non-profit - Private,1,2
4,10008,CRENSHAW COMMUNITY HOSPITAL,101 HOSPITAL CIRCLE,LUVERNE,AL,36049,CRENSHAW,Acute Care Hospitals,Proprietary,1,2


In [10]:
df_HCAHPS = pd.read_csv('/content/drive/MyDrive/siads697/data/HCAHPS-Hospital.csv')
df_HCAHPS = df_HCAHPS[['Facility ID','HCAHPS Measure ID','HCAHPS Question','HCAHPS Answer Percent']]
df_HCAHPS['HCAHPS Answer Percent'] = pd.to_numeric(df_HCAHPS['HCAHPS Answer Percent'], errors='coerce')
df_HCAHPS = df_HCAHPS.dropna(axis=0)

df_info = df_HCAHPS.groupby(['Facility ID']).count()
df = df_info[df_info['HCAHPS Question'] == 72].reset_index()
VALID_FACILITY_IDS = list(df['Facility ID'])

df_HCAHPS = df_HCAHPS[df_HCAHPS['Facility ID'].isin(VALID_FACILITY_IDS) == True]

TEST_ = df_HCAHPS[df_HCAHPS['Facility ID'].isin(VALID_FACILITY_IDS) == True]

  exec(code_obj, self.user_global_ns, self.user_ns)


In [11]:
%%time
# Here's how to calculate summary metrics using group_by and creating some new columns
question_type_dict = {'H_COMP_1_A_P': "nurses", 
                      'H_NURSE_RESPECT_A_P': "nurses", 
                      'H_NURSE_LISTEN_A_P': "nurses", 
                      'H_NURSE_EXPLAIN_A_P': "nurses",
                      'H_COMP_2_A_P': "doctors",
                      'H_DOCTOR_RESPECT_A_P': "doctors",
                      'H_DOCTOR_LISTEN_A_P': "doctors", 
                      'H_DOCTOR_EXPLAIN_A_P': "doctors",
                      'H_COMP_3_A_P': "patients", 
                      'H_CALL_BUTTON_A_P': "patients", 
                      'H_BATH_HELP_A_P': "patients", 
                      'H_COMP_5_A_P': "staffs", 
                      'H_MED_FOR_A_P': "staffs", 
                      'H_SIDE_EFFECTS_A_P': "staffs"
                     }
df_HCAHPS["measurement_type"] = df_HCAHPS.apply(lambda row: question_type_dict[row["HCAHPS Measure ID"]] if row["HCAHPS Measure ID"] in question_type_dict.keys() else "UNKNOWN", axis=1)
grouped = df_HCAHPS.groupby(['Facility ID', 'measurement_type']).mean()
grouped = grouped.drop("UNKNOWN", level="measurement_type").reset_index()

CPU times: user 2.82 s, sys: 20.2 ms, total: 2.84 s
Wall time: 2.86 s


In [12]:
df_HCAHPS

Unnamed: 0,Facility ID,HCAHPS Measure ID,HCAHPS Question,HCAHPS Answer Percent,measurement_type
0,010001,H_COMP_1_A_P,"Patients who reported that their nurses ""Alway...",77.0,nurses
1,010001,H_COMP_1_SN_P,"Patients who reported that their nurses ""Somet...",7.0,UNKNOWN
2,010001,H_COMP_1_U_P,"Patients who reported that their nurses ""Usual...",16.0,UNKNOWN
5,010001,H_NURSE_RESPECT_A_P,"Patients who reported that their nurses ""Alway...",81.0,nurses
6,010001,H_NURSE_RESPECT_SN_P,"Patients who reported that their nurses ""Somet...",7.0,UNKNOWN
...,...,...,...,...,...
450389,670260,H_HSP_RATING_7_8,Patients who gave their hospital a rating of 7...,11.0,UNKNOWN
450390,670260,H_HSP_RATING_9_10,Patients who gave their hospital a rating of 9...,83.0,UNKNOWN
450393,670260,H_RECMND_DN,"Patients who reported NO, they would probably ...",6.0,UNKNOWN
450394,670260,H_RECMND_DY,"Patients who reported YES, they would definite...",84.0,UNKNOWN


In [13]:
d = ['nurses', 'doctors', 'staffs', 'patients']
clinician_df = grouped[['Facility ID']]

for i in range(4):
    CLINICIANS = grouped[grouped['measurement_type']==d[i]][['Facility ID', 'HCAHPS Answer Percent']]
    CLINICIANS.rename(columns = {'HCAHPS Answer Percent':d[i]}, inplace = True)
    CLINICIANS[d[i]] = CLINICIANS[d[i]]
    clinician_df = clinician_df.merge(CLINICIANS, on='Facility ID')
    
clinician_df = clinician_df.drop_duplicates().dropna()


In [14]:
new = df_hosp_gen_info.merge(clinician_df, on='Facility ID')
new = new[new['Hospital overall rating'] != 'Not Available'].reset_index()
print('We have '+ str(len(new)) + ' facilties in the df after joining and dropping NA')
new = new.drop(labels=['index','Hospital Type','Hospital Ownership'], axis=1)
new

We have 3121 facilties in the df after joining and dropping NA


Unnamed: 0,Facility ID,Facility Name,Address,City,State,ZIP Code,County Name,Emergency Services,Hospital overall rating,nurses,doctors,staffs,patients
0,010001,SOUTHEAST HEALTH MEDICAL CENTER,1108 ROSS CLARK CIRCLE,DOTHAN,AL,36301,HOUSTON,1,3,77.00,80.75,63.666667,60.666667
1,010005,MARSHALL MEDICAL CENTERS,2505 U S HIGHWAY 431 NORTH,BOAZ,AL,35957,MARSHALL,1,2,75.00,84.00,56.000000,47.666667
2,010006,NORTH ALABAMA MEDICAL CENTER,1701 VETERANS DRIVE,FLORENCE,AL,35630,LAUDERDALE,1,2,69.00,77.00,55.333333,44.666667
3,010007,MIZELL MEMORIAL HOSPITAL,702 N MAIN ST,OPP,AL,36467,COVINGTON,1,2,84.00,85.75,77.333333,72.666667
4,010011,ST. VINCENT'S EAST,50 MEDICAL PARK EAST DRIVE,BIRMINGHAM,AL,35235,JEFFERSON,1,2,77.25,79.25,63.000000,65.333333
...,...,...,...,...,...,...,...,...,...,...,...,...,...
3116,670103,MEDICAL CITY ALLIANCE,3101 NORTH TARRANT PARKWAY,FORT WORTH,TX,76177,TARRANT,1,4,81.50,77.00,65.000000,65.333333
3117,670106,HCA HOUSTON HEALTHCARE PEARLAND,11100 SHADOW CREEK PARKWAY,PEARLAND,TX,77584,BRAZORIA,1,2,73.25,77.25,61.666667,62.000000
3118,670108,BAYLOR SCOTT & WHITE MEDICAL CENTER - MARBLE F...,810 W HIGHWAY 71,MARBLE FALLS,TX,78654,BURNET,1,5,85.00,81.25,66.000000,69.000000
3119,670120,THE HOSPITALS OF PROVIDENCE TRANSMOUNTAIN CAMPUS,2000 TRANSMOUNTAIN RD,EL PASO,TX,79911,EL PASO,1,4,73.00,74.00,54.000000,54.000000


## Step 2: Take User Input

In [15]:
#columns = ['Hospital overall rating','Emergency Services','State']
columns = ['State']
df = pd.DataFrame(columns = columns, dtype=object)

def user_input(df):
    # Take user input for a series of factors
    return df.append({
        #'Hospital overall rating': int(input('Please rate your desired hospital on scale of 1-5:   ')),
        #'Emergency Services': int(input('Do you need emergency services? (Yes = 1, No = 0)     ')),
        'State':str(input('Which State do you live in?     ')),
        'Doctors': int(input('Rate your ideal doctor on a scale of 0-100%     ')),
        'Nurses': int(input('Rate your ideal nurses on a scale of 0-100%     ')),
        'Patients': int(input('Rate your ideal patients on a scale of 0-100%     ')),
        'Staffs': int(input('Rate your ideal staffs on a scale of 0-100%     '))},
    ignore_index = True)

In [18]:
user_input_df = user_input(df)
user_input_df.head()

Which State do you live in?     MI
Rate your ideal doctor on a scale of 0-100%     86
Rate your ideal nurses on a scale of 0-100%     54
Rate your ideal patients on a scale of 0-100%     39
Rate your ideal staffs on a scale of 0-100%     6


Unnamed: 0,State,Doctors,Nurses,Patients,Staffs
0,MI,86.0,54.0,39.0,6.0


## Step 3: Reccommendar System MVP

Our hospital reccomendation system utilizes the cleaned dataset from Step 1 and the User Input from Step 2. We use cosine similarity to determine the top hospitals based on the users target inputs. This section is broken into a set of function that to conduct vectorization of string parmaters as well as numeric parameters to determine cosine similarity of hospitals relative to a user's need. 

For our MVP, we tested our recommender sytem on Emergency Services (Y/N) and State of hospital as initial inputs. Finally, we sort the outcomes by cosine to determine the list of best hospitals. The outcomes so far look like it works relatively well, but we'll need to see how this fluctuates with additional parameters and available data.

In [19]:
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

In [20]:
def give_me_hospitals(df_hospital, df_patient):
    
    # Collaborative Filtering
    df_hospital = df_hospital[ df_hospital['State'] == df_patient['State'][0] ]
    
    #Patient_Preferences = ['State','Emergency Services']
    #for values in Patient_Preferences:
        #df_hospital = df_hospital[df_hospital[values] == df_patient[values][0]]
        
    #df_hospital = df_hospital[df_hospital['Hospital overall rating'] == str(df_patient['Hospital overall rating'][0])]
    
    
    # Begin attaining X parameters directly based on hospital DF
    X_no2 = np.array([df_hospital['doctors'],
                      df_hospital['nurses'],
                      df_hospital['patients'],
                      df_hospital['staffs']]).reshape([-1,4], order='F')
    
    
    # Begin attaining Y paramters directly based on patient DF
    Y_Final = np.array([df_patient['Doctors'],
                        df_patient['Nurses'],
                        df_patient['Patients'],
                        df_patient['Staffs']]).reshape(-1,4)

    
    Cos = cosine_similarity(X_no2, Y_Final)
    #print(Cos)
    df_hospital['Cosine Similarity'] = Cos

    return df_hospital.sort_values('Cosine Similarity', ascending=False).reset_index()

In [21]:
df_hospital = new.copy()
df_patient = user_input_df.copy()
give_me_hospitals(df_hospital, df_patient)

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy


Unnamed: 0,index,Facility ID,Facility Name,Address,City,State,ZIP Code,County Name,Emergency Services,Hospital overall rating,nurses,doctors,staffs,patients,Cosine Similarity
0,1462,230241,ASCENSION RIVER DISTRICT HOSPITAL,4100 RIVER RD,EAST CHINA,MI,48054,ST. CLAIR,1,3,74.25,72.75,56.000000,65.666667,0.916012
1,1406,230041,MCLAREN BAY REGION,1900 COLUMBUS AVE,BAY CITY,MI,48708,BAY,1,3,77.75,75.25,58.000000,62.333333,0.904287
2,1438,230132,HURLEY MEDICAL CENTER,ONE HURLEY PLAZA,FLINT,MI,48503,GENESEE,1,2,74.00,70.00,55.000000,60.666667,0.902825
3,1390,230003,SPECTRUM HEALTH ZEELAND COMMUNITY HOSPITAL,8333 FELCH ST,ZEELAND,MI,49464,OTTAWA,0,5,82.75,81.75,65.666667,71.000000,0.901813
4,1394,230019,"ASCENSION PROVIDENCE HOSPITAL, SOUTHFIELD AND ...",16001 W NINE MILE RD,SOUTHFIELD,MI,48075,OAKLAND,1,4,76.75,75.25,58.000000,63.333333,0.896412
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
93,1428,230097,MUNSON MEDICAL CENTER,1105 SIXTH STREET,TRAVERSE CITY,MI,49684,GRAND TRAVERSE,1,5,81.75,80.25,63.000000,65.000000,0.770567
94,1472,230303,MUNSON HEALTHCARE MANISTEE HOSPITAL,1465 E PARKDALE AVE,MANISTEE,MI,49660,MANISTEE,1,3,79.25,85.00,70.000000,71.000000,0.766200
95,1432,230105,MCLAREN NORTHERN MICHIGAN,416 CONNABLE AVE,PETOSKEY,MI,49770,EMMET,1,5,81.00,77.75,63.666667,70.333333,0.761659
96,1431,230104,HARPER UNIVERSITY HOSPITAL,3990 JOHN R STREET,DETROIT,MI,48201,WAYNE,1,2,76.50,74.25,58.000000,51.666667,0.751572


### Evaluation

In [22]:
import random

random.seed(42)

In [23]:
def rand_query_generator(df_hospital, n=5):
    """
    generates n random queries based on six input parameters incorporated in the recommender system
    """
    
    queries = []
    
    for i in range(n):
        query = []
        #query.append(random.randint(0,1)) 
        query.append(random.choice(new.State.unique()))                       #state
        query.append(random.randint(int(df_hospital.doctors.min()), 100))     #doctors
        query.append(random.randint(int(df_hospital.nurses.min()), 100))      #nurses
        query.append(random.randint(int(df_hospital.patients.min()), 100))    #patients
        query.append(random.randint(int(df_hospital.staffs.min()), 100))      #staffs
        
        queries.append(query)
        
    df_queries = pd.DataFrame(queries, columns=[#'Emergency Services',
                                                'State', 'Doctors', 'Nurses', 'Patients', 'Staffs'])
    
    return df_queries

#### Precision and Recall

In [24]:
def evaluation_pre_rec_at_n(df_hospital, queries, n=10):
    """
    takes overall hospital data and evaluates precision and recall for 
    multiple queries using the relevant and recommended hospitals
    """
    
    pre_at_n, rec_at_n = {}, {}
    
    for i in range(len(queries)):
        print(i)
        try:
            # obtaining hospital recommendations
            df_patient = queries.iloc[i:i+1].reset_index(drop=True)
            
            hosp_rec = give_me_hospitals(df_hospital, df_patient)
            print('query #' + str(i+1) + ': ' + str(len(hosp_rec)) + ' recommendations generated')

            hosp_rel = df_hospital[(df_hospital['State'] == df_patient['State'][0])]
            #hosp_rel.replace('Not Available', np.nan, inplace=True)
            #hosp_rel.sort_values(by=['Hospital overall rating'], ascending=False, inplace=True)
            #hosp_rel.sort_values(by=['Emergency Services'], ascending=False, inplace=True)
            
            rating_max = str(np.array(list(map(int, hosp_rel['Hospital overall rating'].values))).max())
            hosp_rel = hosp_rel[hosp_rel['Hospital overall rating'] == rating_max]
            
            #hosp_rel = hosp_rel[:n]
            print('relevant hosp = ', len(hosp_rel))
            
            
            # calculating precision and recall for each query
            if n == -1 or n <= len(hosp_rec):
                hosp_rec = hosp_rec[:n]
            
            rel_rec = list(set(hosp_rec['Facility ID']).intersection(hosp_rel['Facility ID']))
            print('relevant recommended hosp = ', len(rel_rec))
            
            pre_at_n['q'+str(i+1)] = 0 if rel_rec==0 else len(rel_rec) / len(hosp_rec)
            
            rec_at_n['q'+str(i+1)] = 0 if rel_rec==0 else len(rel_rec) / len(hosp_rel)
        
        except:
            # errors encountered: ValueError, KeyError, ZeroDivisionError
            print('query #' + str(i+1) + ': ' + 'Too few or no relevant hospitals found in the database.')
            pre_at_n['q'+str(i+1)] = np.nan
            rec_at_n['q'+str(i+1)] = np.nan
    
    return pre_at_n, rec_at_n

#### Mean Average Precision

In [25]:
def evaluation_avg_pre(df_hospital, queries, cutoff=10):
    """
    takes overall hospital data and evaluates (mean) average precision for 
    multiple queries using the relevant and recommended hospitals
    """
    
    avg_pre, mean_avg_pre = {}, {}
    
    for i in range(len(queries)):
        
        rel_counter =  0
        pre = []
        
        try:
            # obtaining hospital recommendations
            df_patient = queries.iloc[i:i+1].reset_index(drop=True)
            
            hosp_rec = give_me_hospitals(df_hospital, df_patient)
            print('query #' + str(i+1) + ': ' + str(len(hosp_rec)) + ' recommendations generated')

            hosp_rel = df_hospital[(df_hospital['State'] == df_patient['State'][0])]
            hosp_rel.sort_values(by=['Emergency Services'], ascending=False, inplace=True)
            
            rating_max = str(np.array(list(map(int, hosp_rel['Hospital overall rating'].values))).max())
            hosp_rel = hosp_rel[hosp_rel['Hospital overall rating'] == rating_max]
            
            #hosp_rel = hosp_rel[:n]
            print('relevant hosp = ', len(hosp_rel))
            
            
            # calculating average precision for each query
            if cutoff == -1 or cutoff <= len(hosp_rec):
                hosp_rec = hosp_rec[:cutoff]
            
            for j, hosp in enumerate(hosp_rec['Facility ID']):
                if hosp in hosp_rel['Facility ID']:
                    rel_counter += 1
                    pre.append(rel_counter/(j+1))

            #avg_pre['q'+str(i+1)] = sum(pre) / len(hosp_rel)
            avg_pre['q'+str(i+1)] = 0 if hosp_rel.empty else sum(pre) / len(hosp_rel)
            
        except:
            # errors encountered: ValueError, KeyError
            print('query #' + str(i+1) + ': ' + 'Too few or no hospitals found in the database.')
    
    
    # calculating mean of average precisions for all queries
    mean_avg_pre = sum(avg_pre.values()) / len(avg_pre.values())
    
    return avg_pre, mean_avg_pre

#### Normalized DCG

In [26]:
import math

def evaluation_nDCG_at_n(df_hospital, queries, n=10, base=2):
    """
    takes overall hospital data and evaluates normalized DCG for 
    multiple queries using the relevant and recommended hospitals
    """
    
    ndcg = {}
    
    for i in range(len(queries)):
        
        try:
            # obtaining hospital recommendations
            df_patient = queries.iloc[i:i+1].reset_index(drop=True)
            
            hosp_rec = give_me_hospitals(df_hospital, df_patient)
            print('query #' + str(i+1) + ': ' + str(len(hosp_rec)) + ' recommendations generated')
            
            hosp_rel = df_hospital[(df_hospital['State'] == df_patient['State'][0])]
            hosp_rel.sort_values(by=['Emergency Services'], ascending=False, inplace=True)
            
            rating_max = str(np.array(list(map(int, hosp_rel['Hospital overall rating'].values))).max())
            hosp_rel = hosp_rel[hosp_rel['Hospital overall rating'] == rating_max]
            
            #hosp_rel = hosp_rel[:n]
            print('relevant hosp = ', len(hosp_rel))
            
            
            # calculating relevant hospital ranks
            rank = len(hosp_rel) + 1
            rel = {}
            
            for h in hosp_rel['Facility ID']:
                rel[h] = rank
                rank -= 1
            
            
            # calculating system order ranks for recommended hospitals
            rec_sys = {}

            for h in hosp_rec['Facility ID']:
                if h in hosp_rel['Facility ID']:
                    rec_sys[h] = rel[h]
                else:
                    rec_sys[h] = 1
            
            
            # calculating dcg and ndcg
            if n == -1 or n > len(hosp_rec):
                sys_scores = list(rec_sys.values())
                idl_scores = list(rel.values())
            else:
                sys_scores = list(rec_sys.values())[:n]
                idl_scores = list(rel.values())[:n]

            if len(idl_scores) < len(sys_scores):
                idl_scores += [1] * (len(sys_scores)-len(idl_scores))
            
            sys_list = []
            idl_list = []

            for j, (sys, idl) in enumerate(zip(sys_scores, idl_scores)):

                sys_rate = sys / math.log(j+1, base) if j > (base-2) else sys
                sys_list.append(sys_rate)

                idl_rate = idl / math.log(j+1, base) if j > (base-2) else idl
                idl_list.append(idl_rate) 

            sys_dcg = sum(sys_list)
            idl_dcg = sum(idl_list)

            ndcg['q'+str(i+1)] = sys_dcg / idl_dcg
            
            
        except:
            # errors encountered: ValueError, KeyError
            print('query #' + str(i+1) + ': ' + 'Too few or no hospitals found in the database.')
            
    return ndcg

##### testing

In [27]:
queries_test = rand_query_generator(new)
queries_test

Unnamed: 0,State,Doctors,Nurses,Patients,Staffs
0,RI,65,57,85,55
1,IA,72,64,85,44
2,TN,92,61,75,65
3,AZ,59,61,51,52
4,NY,96,57,73,50


In [28]:
print("PRECISION & RECALL\n")
evaluation_pre_rec_at_n(new, queries_test)

PRECISION & RECALL

0
query #1: 10 recommendations generated
relevant hosp =  1
relevant recommended hosp =  1
1
query #2: 55 recommendations generated
relevant hosp =  11
relevant recommended hosp =  2
2
query #3: 74 recommendations generated
relevant hosp =  2
relevant recommended hosp =  1
3
query #4: 50 recommendations generated
relevant hosp =  4
relevant recommended hosp =  2
4
query #5: 134 recommendations generated
relevant hosp =  7
relevant recommended hosp =  1


A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user

({'q1': 0.1, 'q2': 0.2, 'q3': 0.1, 'q4': 0.2, 'q5': 0.1},
 {'q1': 1.0,
  'q2': 0.18181818181818182,
  'q3': 0.5,
  'q4': 0.5,
  'q5': 0.14285714285714285})

In [29]:
print("MEAN AVERAGE PRECISION\n")
evaluation_avg_pre(new, queries_test)

MEAN AVERAGE PRECISION

query #1: 10 recommendations generated
relevant hosp =  1
query #2: 55 recommendations generated
relevant hosp =  11
query #3: 74 recommendations generated
relevant hosp =  2
query #4: 50 recommendations generated
relevant hosp =  4
query #5: 134 recommendations generated
relevant hosp =  7


A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)


({'q1': 0.0, 'q2': 0.0, 'q3': 0.0, 'q4': 0.0, 'q5': 0.0}, 0.0)

In [30]:
print("NDCG\n")
evaluation_nDCG_at_n(new, queries_test)

NDCG

query #1: 10 recommendations generated
relevant hosp =  1
query #2: 55 recommendations generated
relevant hosp =  11
query #3: 74 recommendations generated
relevant hosp =  2
query #4: 50 recommendations generated
relevant hosp =  4
query #5: 134 recommendations generated
relevant hosp =  7


A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)


{'q1': 0.8401149768189771,
 'q2': 0.11439839887088857,
 'q3': 0.6365616336988094,
 'q4': 0.37488311901084637,
 'q5': 0.20341748548710972}

##### Scaled up testing and Visualization

In [31]:
q_500 = rand_query_generator(new, n=500)
q_500.head()

Unnamed: 0,State,Doctors,Nurses,Patients,Staffs
0,UT,99,100,72,64
1,IN,86,93,55,89
2,AL,68,100,65,59
3,KY,67,69,99,86
4,MA,64,61,62,44


In [45]:
ndcg_500 = evaluation_nDCG_at_n(new, q_500)
#ndcg_500

query #1: 24 recommendations generated
relevant hosp =  12


A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)


query #2: 89 recommendations generated
relevant hosp =  10
query #3: 65 recommendations generated
relevant hosp =  5
query #4: 64 recommendations generated
relevant hosp =  5
query #5: 53 recommendations generated
relevant hosp =  12
query #6: 98 recommendations generated
relevant hosp =  20
query #7: 50 recommendations generated
relevant hosp =  4
query #8: 47 recommendations generated
relevant hosp =  2
query #9: 10 recommendations generated
relevant hosp =  1
query #10: 24 recommendations generated
relevant hosp =  12
query #11: 55 recommendations generated
relevant hosp =  6
query #12: 25 recommendations generated
relevant hosp =  4
query #13: 54 recommendations generated
relevant hosp =  13
query #14: 16 recommendations generated
relevant hosp =  6
query #15: 49 recommendations generated
relevant hosp =  4
query #16: 10 recommendations generated
relevant hosp =  1
query #17: 64 recommendations generated
relevant hosp =  5
query #18: 74 recommendations generated
relevant hosp =  2


A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)


 2
query #22: 50 recommendations generated
relevant hosp =  6
query #23: 10 recommendations generated
relevant hosp =  1
query #24: 46 recommendations generated
relevant hosp =  10
query #25: 134 recommendations generated
relevant hosp =  7
query #26: 6 recommendations generated
relevant hosp =  4
query #27: 74 recommendations generated
relevant hosp =  2
query #28: 47 recommendations generated
relevant hosp =  2
query #29: 127 recommendations generated
relevant hosp =  33
query #30: 74 recommendations generated
relevant hosp =  2
query #31: 53 recommendations generated
relevant hosp =  12
query #32: 23 recommendations generated
relevant hosp =  3
query #33: 75 recommendations generated
relevant hosp =  8
query #34: 10 recommendations generated
relevant hosp =  1
query #35: 22 recommendations generated
relevant hosp =  3
query #36: 10 recommendations generated
relevant hosp =  1
query #37: 23 recommendations generated
relevant hosp =  2
query #38: 84 recommendations generated
relevant 

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy


query #43: 32 recommendations generated
relevant hosp =  4
query #44: 225 recommendations generated
relevant hosp =  26
query #45: 49 recommendations generated
relevant hosp =  4
query #46: 24 recommendations generated
relevant hosp =  2
query #47: 53 recommendations generated
relevant hosp =  12
query #48: 46 recommendations generated
relevant hosp =  10
query #49: 10 recommendations generated
relevant hosp =  1
query #50: 50 recommendations generated
relevant hosp =  4
query #51: 64 recommendations generated
relevant hosp =  5
query #52: 6 recommendations generated
relevant hosp =  1
query #53: 26 recommendations generated
relevant hosp =  2
query #54: 25 recommendations generated
relevant hosp =  4
query #55: 21 recommendations generated
relevant hosp =  5
query #56: 49 recommendations generated
relevant hosp =  4
query #57: 53 recommendations generated
relevant hosp =  12
query #58: 10 recommendations generated
relevant hosp =  1
query #59: 64 recommendations generated
relevant hos

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy


query #65: 26 recommendations generated
relevant hosp =  2
query #66: 11 recommendations generated
relevant hosp =  2
query #67: 158 recommendations generated
relevant hosp =  25
query #68: 49 recommendations generated
relevant hosp =  4
query #69: 42 recommendations generated
relevant hosp =  3
query #70: 134 recommendations generated
relevant hosp =  7
query #71: 134 recommendations generated
relevant hosp =  7
query #72: 273 recommendations generated
relevant hosp =  38
query #73: 60 recommendations generated
relevant hosp =  12
query #74: 3 recommendations generated
relevant hosp =  3
query #75: 60 recommendations generated
relevant hosp =  12
query #76: 134 recommendations generated
relevant hosp =  24
query #77: 79 recommendations generated
relevant hosp =  7
query #78: 23 recommendations generated
relevant hosp =  3
query #79: 3 recommendations generated
relevant hosp =  3
query #80: 134 recommendations generated
relevant hosp =  24
query #81: 98 recommendations generated
releva

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy


query #86: 25 recommendations generated
relevant hosp =  4
query #87: 136 recommendations generated
relevant hosp =  21
query #88: 10 recommendations generated
relevant hosp =  1
query #89: 49 recommendations generated
relevant hosp =  4
query #90: 64 recommendations generated
relevant hosp =  5
query #91: 6 recommendations generated
relevant hosp =  1
query #92: 24 recommendations generated
relevant hosp =  2
query #93: 6 recommendations generated
relevant hosp =  4
query #94: 10 recommendations generated
relevant hosp =  1
query #95: 158 recommendations generated
relevant hosp =  25
query #96: 54 recommendations generated
relevant hosp =  13
query #97: 55 recommendations generated
relevant hosp =  11
query #98: 127 recommendations generated
relevant hosp =  33
query #99: 55 recommendations generated
relevant hosp =  11
query #100: 9 recommendations generated
relevant hosp =  1
query #101: 21 recommendations generated
relevant hosp =  5
query #102: 26 recommendations generated
relevan

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)


query #108: 79 recommendations generated
relevant hosp =  7
query #109: 6 recommendations generated
relevant hosp =  4
query #110: 50 recommendations generated
relevant hosp =  6
query #111: 98 recommendations generated
relevant hosp =  20
query #112: 134 recommendations generated
relevant hosp =  7
query #113: 22 recommendations generated
relevant hosp =  3
query #114: 65 recommendations generated
relevant hosp =  5
query #115: 2 recommendations generated
relevant hosp =  2
query #116: 273 recommendations generated
relevant hosp =  38
query #117: 16 recommendations generated
relevant hosp =  6
query #118: 16 recommendations generated
relevant hosp =  6
query #119: 61 recommendations generated
relevant hosp =  6
query #120: 16 recommendations generated
relevant hosp =  6
query #121: 12 recommendations generated
relevant hosp =  7
query #122: 127 recommendations generated
relevant hosp =  33
query #123: 32 recommendations generated
relevant hosp =  4
query #124: 23 recommendations gener

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)


11
query #130: 273 recommendations generated
relevant hosp =  38
query #131: 60 recommendations generated
relevant hosp =  12
query #132: 79 recommendations generated
relevant hosp =  7
query #133: 65 recommendations generated
relevant hosp =  5
query #134: 84 recommendations generated
relevant hosp =  25
query #135: 55 recommendations generated
relevant hosp =  11
query #136: 23 recommendations generated
relevant hosp =  3
query #137: 42 recommendations generated
relevant hosp =  3
query #138: 134 recommendations generated
relevant hosp =  7
query #139: 89 recommendations generated
relevant hosp =  6
query #140: 55 recommendations generated
relevant hosp =  11
query #141: 88 recommendations generated
relevant hosp =  8
query #142: 24 recommendations generated
relevant hosp =  2
query #143: 53 recommendations generated
relevant hosp =  12
query #144: 158 recommendations generated
relevant hosp =  25
query #145: 24 recommendations generated
relevant hosp =  12
query #146: 53 recommendat

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)


relevant hosp =  5
query #151: 89 recommendations generated
relevant hosp =  10
query #152: 23 recommendations generated
relevant hosp =  2
query #153: 74 recommendations generated
relevant hosp =  2
query #154: 6 recommendations generated
relevant hosp =  1
query #155: 79 recommendations generated
relevant hosp =  7
query #156: 48 recommendations generated
relevant hosp =  16
query #157: 23 recommendations generated
relevant hosp =  3
query #158: 42 recommendations generated
relevant hosp =  3
query #159: 75 recommendations generated
relevant hosp =  8
query #160: 21 recommendations generated
relevant hosp =  5
query #161: 2 recommendations generated
relevant hosp =  2
query #162: 89 recommendations generated
relevant hosp =  10
query #163: 49 recommendations generated
relevant hosp =  4
query #164: 22 recommendations generated
relevant hosp =  3
query #165: 6 recommendations generated
relevant hosp =  1
query #166: 134 recommendations generated
relevant hosp =  24
query #167: 54 reco

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)


 12
query #171: 22 recommendations generated
relevant hosp =  3
query #172: 10 recommendations generated
relevant hosp =  1
query #173: 29 recommendations generated
relevant hosp =  1
query #174: 60 recommendations generated
relevant hosp =  12
query #175: 16 recommendations generated
relevant hosp =  6
query #176: 53 recommendations generated
relevant hosp =  12
query #177: 25 recommendations generated
relevant hosp =  4
query #178: 23 recommendations generated
relevant hosp =  3
query #179: 88 recommendations generated
relevant hosp =  8
query #180: 10 recommendations generated
relevant hosp =  1
query #181: 11 recommendations generated
relevant hosp =  2
query #182: 55 recommendations generated
relevant hosp =  11
query #183: 55 recommendations generated
relevant hosp =  11
query #184: 47 recommendations generated
relevant hosp =  2
query #185: 42 recommendations generated
relevant hosp =  3
query #186: 84 recommendations generated
relevant hosp =  25
query #187: 136 recommendations

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)


query #193: 63 recommendations generated
relevant hosp =  3
query #194: 2 recommendations generated
relevant hosp =  2
query #195: 23 recommendations generated
relevant hosp =  3
query #196: 6 recommendations generated
relevant hosp =  4
query #197: 10 recommendations generated
relevant hosp =  1
query #198: 63 recommendations generated
relevant hosp =  3
query #199: 88 recommendations generated
relevant hosp =  8
query #200: 22 recommendations generated
relevant hosp =  3
query #201: 29 recommendations generated
relevant hosp =  1
query #202: 50 recommendations generated
relevant hosp =  4
query #203: 6 recommendations generated
relevant hosp =  1
query #204: 65 recommendations generated
relevant hosp =  5
query #205: 24 recommendations generated
relevant hosp =  2
query #206: 50 recommendations generated
relevant hosp =  6
query #207: 79 recommendations generated
relevant hosp =  7
query #208: 74 recommendations generated
relevant hosp =  2
query #209: 60 recommendations generated
re

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)


query #216: 12 recommendations generated
relevant hosp =  7
query #217: 21 recommendations generated
relevant hosp =  5
query #218: 61 recommendations generated
relevant hosp =  6
query #219: 61 recommendations generated
relevant hosp =  6
query #220: 50 recommendations generated
relevant hosp =  6
query #221: 22 recommendations generated
relevant hosp =  3
query #222: 49 recommendations generated
relevant hosp =  4
query #223: 273 recommendations generated
relevant hosp =  38
query #224: 136 recommendations generated
relevant hosp =  21
query #225: 23 recommendations generated
relevant hosp =  3
query #226: 64 recommendations generated
relevant hosp =  5
query #227: 48 recommendations generated
relevant hosp =  16
query #228: 2 recommendations generated
relevant hosp =  2
query #229: 48 recommendations generated
relevant hosp =  16
query #230: 60 recommendations generated
relevant hosp =  12
query #231: 13 recommendations generated
relevant hosp =  2
query #232: 75 recommendations gen

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)


query #238: 127 recommendations generated
relevant hosp =  33
query #239: 50 recommendations generated
relevant hosp =  6
query #240: 79 recommendations generated
relevant hosp =  7
query #241: 10 recommendations generated
relevant hosp =  1
query #242: 88 recommendations generated
relevant hosp =  8
query #243: 10 recommendations generated
relevant hosp =  1
query #244: 127 recommendations generated
relevant hosp =  33
query #245: 225 recommendations generated
relevant hosp =  26
query #246: 10 recommendations generated
relevant hosp =  1
query #247: 29 recommendations generated
relevant hosp =  1
query #248: 48 recommendations generated
relevant hosp =  16
query #249: 6 recommendations generated
relevant hosp =  4
query #250: 24 recommendations generated
relevant hosp =  12
query #251: 21 recommendations generated
relevant hosp =  5
query #252: 127 recommendations generated
relevant hosp =  33
query #253: 136 recommendations generated
relevant hosp =  21
query #254: 64 recommendation

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy


query #260: 47 recommendations generated
relevant hosp =  2
query #261: 2 recommendations generated
relevant hosp =  2
query #262: 26 recommendations generated
relevant hosp =  2
query #263: 50 recommendations generated
relevant hosp =  4
query #264: 53 recommendations generated
relevant hosp =  12
query #265: 23 recommendations generated
relevant hosp =  3
query #266: 49 recommendations generated
relevant hosp =  4
query #267: 50 recommendations generated
relevant hosp =  6


A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy


query #268: 49 recommendations generated
relevant hosp =  4
query #269: 10 recommendations generated
relevant hosp =  1
query #270: 75 recommendations generated
relevant hosp =  8
query #271: 273 recommendations generated
relevant hosp =  38
query #272: 61 recommendations generated
relevant hosp =  6
query #273: 25 recommendations generated
relevant hosp =  4
query #274: 225 recommendations generated
relevant hosp =  26


A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy


query #275: 26 recommendations generated
relevant hosp =  2
query #276: 32 recommendations generated
relevant hosp =  4
query #277: 60 recommendations generated
relevant hosp =  12
query #278: 3 recommendations generated
relevant hosp =  3
query #279: 23 recommendations generated
relevant hosp =  2
query #280: 10 recommendations generated
relevant hosp =  1
query #281: 65 recommendations generated
relevant hosp =  5
query #282: 89 recommendations generated
relevant hosp =  6
query #283: 46 recommendations generated
relevant hosp =  10
query #284: 225 recommendations generated
relevant hosp =  26
query #285: 22 recommendations generated
relevant hosp =  3
query #286: 47 recommendations generated
relevant hosp =  2


A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy


query #287: 225 recommendations generated
relevant hosp =  26
query #288: 16 recommendations generated
relevant hosp =  6
query #289: 74 recommendations generated
relevant hosp =  2
query #290: 61 recommendations generated
relevant hosp =  6
query #291: 89 recommendations generated
relevant hosp =  10
query #292: 74 recommendations generated
relevant hosp =  2
query #293: 10 recommendations generated
relevant hosp =  1
query #294: 29 recommendations generated
relevant hosp =  1
query #295: 11 recommendations generated
relevant hosp =  2
query #296: 42 recommendations generated
relevant hosp =  3
query #297: 26 recommendations generated
relevant hosp =  2
query #298: 64 recommendations generated
relevant hosp =  5
query #299: 63 recommendations generated
relevant hosp =  3
query #300: 23 recommendations generated
relevant hosp =  3
query #301: 89 recommendations generated
relevant hosp =  10
query #302: 26 recommendations generated
relevant hosp =  2
query #303: 6 recommendations genera

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)


query #311: 25 recommendations generated
relevant hosp =  4
query #312: 23 recommendations generated
relevant hosp =  3
query #313: 225 recommendations generated
relevant hosp =  26
query #314: 11 recommendations generated
relevant hosp =  2
query #315: 23 recommendations generated
relevant hosp =  2
query #316: 64 recommendations generated
relevant hosp =  5
query #317: 24 recommendations generated
relevant hosp =  2
query #318: 25 recommendations generated
relevant hosp =  4
query #319: 49 recommendations generated
relevant hosp =  4
query #320: 22 recommendations generated
relevant hosp =  3
query #321: 134 recommendations generated
relevant hosp =  24
query #322: 9 recommendations generated
relevant hosp =  1
query #323: 26 recommendations generated
relevant hosp =  2
query #324: 75 recommendations generated
relevant hosp =  8
query #325: 13 recommendations generated
relevant hosp =  2
query #326: 26 recommendations generated
relevant hosp =  2
query #327: 158 recommendations gener

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)


query #333: 55 recommendations generated
relevant hosp =  6
query #334: 24 recommendations generated
relevant hosp =  2
query #335: 50 recommendations generated
relevant hosp =  4
query #336: 49 recommendations generated
relevant hosp =  4
query #337: 79 recommendations generated
relevant hosp =  7
query #338: 24 recommendations generated
relevant hosp =  12
query #339: 42 recommendations generated
relevant hosp =  3
query #340: 64 recommendations generated
relevant hosp =  5
query #341: 24 recommendations generated
relevant hosp =  12
query #342: 61 recommendations generated
relevant hosp =  6
query #343: 25 recommendations generated
relevant hosp =  4
query #344: 55 recommendations generated
relevant hosp =  6
query #345: 89 recommendations generated
relevant hosp =  10
query #346: 79 recommendations generated
relevant hosp =  7
query #347: 42 recommendations generated
relevant hosp =  3
query #348: 158 recommendations generated
relevant hosp =  25
query #349: 48 recommendations gene

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)


relevant hosp =  10
query #357: 54 recommendations generated
relevant hosp =  13
query #358: 84 recommendations generated
relevant hosp =  25
query #359: 13 recommendations generated
relevant hosp =  2
query #360: 127 recommendations generated
relevant hosp =  33
query #361: 225 recommendations generated
relevant hosp =  26
query #362: 225 recommendations generated
relevant hosp =  26
query #363: 12 recommendations generated
relevant hosp =  7
query #364: 273 recommendations generated
relevant hosp =  38
query #365: 48 recommendations generated
relevant hosp =  16
query #366: 23 recommendations generated
relevant hosp =  3
query #367: 11 recommendations generated
relevant hosp =  2
query #368: 49 recommendations generated
relevant hosp =  4
query #369: 49 recommendations generated
relevant hosp =  4
query #370: 32 recommendations generated
relevant hosp =  4
query #371: 64 recommendations generated
relevant hosp =  5
query #372: 32 recommendations generated
relevant hosp =  4
query #37

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)


query #377: 6 recommendations generated
relevant hosp =  4
query #378: 12 recommendations generated
relevant hosp =  7
query #379: 89 recommendations generated
relevant hosp =  10
query #380: 84 recommendations generated
relevant hosp =  25
query #381: 2 recommendations generated
relevant hosp =  2
query #382: 273 recommendations generated
relevant hosp =  38
query #383: 79 recommendations generated
relevant hosp =  7
query #384: 79 recommendations generated
relevant hosp =  7
query #385: 3 recommendations generated
relevant hosp =  3
query #386: 48 recommendations generated
relevant hosp =  16
query #387: 46 recommendations generated
relevant hosp =  10
query #388: 29 recommendations generated
relevant hosp =  1
query #389: 89 recommendations generated
relevant hosp =  6
query #390: 88 recommendations generated
relevant hosp =  8
query #391: 98 recommendations generated
relevant hosp =  20
query #392: 55 recommendations generated
relevant hosp =  11
query #393: 12 recommendations gene

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)


relevant hosp =  12
query #398: 10 recommendations generated
relevant hosp =  1
query #399: 63 recommendations generated
relevant hosp =  3
query #400: 64 recommendations generated
relevant hosp =  5
query #401: 273 recommendations generated
relevant hosp =  38
query #402: 49 recommendations generated
relevant hosp =  4
query #403: 48 recommendations generated
relevant hosp =  16
query #404: 23 recommendations generated
relevant hosp =  3
query #405: 2 recommendations generated
relevant hosp =  2
query #406: 60 recommendations generated
relevant hosp =  12
query #407: 63 recommendations generated
relevant hosp =  3
query #408: 24 recommendations generated
relevant hosp =  2
query #409: 48 recommendations generated
relevant hosp =  16
query #410: 55 recommendations generated
relevant hosp =  11
query #411: 3 recommendations generated
relevant hosp =  3
query #412: 61 recommendations generated
relevant hosp =  6
query #413: 74 recommendations generated
relevant hosp =  2
query #414: 49 r

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)



relevant hosp =  1
query #419: 6 recommendations generated
relevant hosp =  1
query #420: 47 recommendations generated
relevant hosp =  2
query #421: 16 recommendations generated
relevant hosp =  6
query #422: 32 recommendations generated
relevant hosp =  4
query #423: 29 recommendations generated
relevant hosp =  1
query #424: 55 recommendations generated
relevant hosp =  6
query #425: 158 recommendations generated
relevant hosp =  25
query #426: 25 recommendations generated
relevant hosp =  4
query #427: 88 recommendations generated
relevant hosp =  8
query #428: 55 recommendations generated
relevant hosp =  6
query #429: 158 recommendations generated
relevant hosp =  25
query #430: 55 recommendations generated
relevant hosp =  6
query #431: 50 recommendations generated
relevant hosp =  4
query #432: 55 recommendations generated
relevant hosp =  11
query #433: 2 recommendations generated
relevant hosp =  2
query #434: 29 recommendations generated
relevant hosp =  1
query #435: 60 re

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)


relevant hosp =  8
query #441: 89 recommendations generated
relevant hosp =  6
query #442: 158 recommendations generated
relevant hosp =  25
query #443: 89 recommendations generated
relevant hosp =  10
query #444: 21 recommendations generated
relevant hosp =  5
query #445: 55 recommendations generated
relevant hosp =  11
query #446: 9 recommendations generated
relevant hosp =  1
query #447: 49 recommendations generated
relevant hosp =  4
query #448: 61 recommendations generated
relevant hosp =  6
query #449: 75 recommendations generated
relevant hosp =  8
query #450: 48 recommendations generated
relevant hosp =  16
query #451: 158 recommendations generated
relevant hosp =  25
query #452: 6 recommendations generated
relevant hosp =  4
query #453: 49 recommendations generated
relevant hosp =  4
query #454: 3 recommendations generated
relevant hosp =  3
query #455: 23 recommendations generated
relevant hosp =  2
query #456: 134 recommendations generated
relevant hosp =  7
query #457: 61 r

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)


query #463: 54 recommendations generated
relevant hosp =  13
query #464: 273 recommendations generated
relevant hosp =  38
query #465: 10 recommendations generated
relevant hosp =  1
query #466: 53 recommendations generated
relevant hosp =  12
query #467: 53 recommendations generated
relevant hosp =  12
query #468: 74 recommendations generated
relevant hosp =  2
query #469: 12 recommendations generated
relevant hosp =  7
query #470: 23 recommendations generated
relevant hosp =  3
query #471: 26 recommendations generated
relevant hosp =  2
query #472: 29 recommendations generated
relevant hosp =  1
query #473: 63 recommendations generated
relevant hosp =  3
query #474: 22 recommendations generated
relevant hosp =  3
query #475: 49 recommendations generated
relevant hosp =  4
query #476: 84 recommendations generated
relevant hosp =  25
query #477: 2 recommendations generated
relevant hosp =  2
query #478: 12 recommendations generated
relevant hosp =  7
query #479: 84 recommendations gene

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy


query #485: 11 recommendations generated
relevant hosp =  2
query #486: 10 recommendations generated
relevant hosp =  1
query #487: 79 recommendations generated
relevant hosp =  7
query #488: 74 recommendations generated
relevant hosp =  2
query #489: 136 recommendations generated
relevant hosp =  21
query #490: 55 recommendations generated
relevant hosp =  11
query #491: 60 recommendations generated
relevant hosp =  12
query #492: 24 recommendations generated
relevant hosp =  2
query #493: 273 recommendations generated
relevant hosp =  38
query #494: 79 recommendations generated
relevant hosp =  7
query #495: 6 recommendations generated
relevant hosp =  4
query #496: 47 recommendations generated
relevant hosp =  2
query #497: 50 recommendations generated
relevant hosp =  6
query #498: 225 recommendations generated
relevant hosp =  26
query #499: 273 recommendations generated
relevant hosp =  38
query #500: 24 recommendations generated
relevant hosp =  2


A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  return func(*args, **kwargs)


In [33]:
#df_ndcg_500 = pd.DataFrame(list(ndcg_500.items()), columns=['query', 'nDCG'])
df_viz = q_500.copy()
df_viz['nDCG'] = ndcg_500.values()
df_viz.head()

Unnamed: 0,State,Doctors,Nurses,Patients,Staffs,nDCG
0,UT,99,100,72,64,0.102655
1,IN,86,93,55,89,0.129176
2,AL,68,100,65,59,0.298925
3,KY,67,69,99,86,0.298925
4,MA,64,61,62,44,0.102655


In [42]:
#!python -m pip install --upgrade pip

In [41]:
#! pip install vega

In [39]:
import altair as alt
alt.renderers.enable('default')

RendererRegistry.enable('default')

In [40]:
alt.Chart(df_viz).mark_bar().encode(
    x=alt.X("nDCG:Q", bin=True),
    y="count()",
).properties(title="Histogram of nDCG for 500 Test Queries")

In [47]:
q_5k = rand_query_generator(new, n=5000)
ndcg_5k = evaluation_nDCG_at_n(new, q_5k)

df_5k = q_5k.copy()
df_5k['nDCG'] = ndcg_5k.values()
df_5k.head()

Output hidden; open in https://colab.research.google.com to view.

In [48]:
alt.Chart(df_5k).mark_bar().encode(
    x=alt.X("nDCG:Q", bin=True),
    y="count()",
).properties(title="Histogram of nDCG for 5000 Test Queries")