---
# Patient's Condition Classification Using Drug Reviews
---
### Problem Statement:

**Business Objective:**
The dataset consists of 161297 patient reviews, each with a drug name, a condition, the review text and a rating.

**Our *goal* is to examine how patients feel about the drugs they use, including their positive and negative experiences, so
that we can *recommend* a suitable drug to a patient**. 

**By analyzing the reviews, we can understand the drug *effectiveness* and its side effects.**

---
**Data Description:**

The dataset provides patient reviews on specific drugs along with related
conditions and a 10 star patient rating reflecting overall patient satisfaction.


The dataset covers many patient conditions, but we focus only on
the three below:

**Classify the conditions below from the patients' reviews**

a. Depression 

b. High Blood Pressure

c. Diabetes, Type 2

Attribute Information:

1. DrugName (categorical)  : name of drug
2. condition (categorical) : name of condition
3. review (text)           : patient review
4. rating (numerical)      : 10 star patient rating
5. date (date)             : date of review entry
6. usefulCount (numerical) : number of users who found review useful
---

The business objective is twofold:

1. Patient's condition classification using drug reviews,

2. Based on the condition, recommending the top drugs using sentiment analysis.

The first case is straightforward: we extract features from the reviews using NLP techniques and apply ML algorithms to classify the condition.  

In the second case we need to find the sentiment associated with each drug review. We also have two other features, 'rating' and 'usefulCount', and we use them together with the review sentiment to score each drug.

So we **define** a new **measure** called **'effectiveness'**.

**effectiveness = (mean over a drug's reviews of [0.5 (scaled sentiment of review) + 0.5 (sentiment from rating)]) * (mean usefulCount over the drug's reviews)**

**Explanation of the Measure:**

Unlike a sentiment polarity, the effectiveness score is not confined to the range -1 to 1.
The scaled sentiment of a review lies between 1 and 10, and the rating is used as-is for the sentiment from rating, since it already lies between 1 and 10.
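
As a rough illustration of the scaling step, the sketch below maps a polarity onto the 1 to 10 range with the usual min-max formula. The helper name `scale_polarity` is hypothetical, and the sketch assumes the polarity bounds are -1 and 1. Note that `MinMaxScaler`, which the notebook uses for this step, takes its bounds from the minimum and maximum observed in the data, so the two agree only when the data actually contain both extremes.

```python
def scale_polarity(polarity, old_min=-1.0, old_max=1.0, new_min=1.0, new_max=10.0):
    """Linearly map a polarity from [old_min, old_max] onto [new_min, new_max].

    Hypothetical helper for illustration; the default bounds are an assumption.
    """
    fraction = (polarity - old_min) / (old_max - old_min)
    return new_min + fraction * (new_max - new_min)


# A few sample polarities: most negative, neutral, mildly positive, most positive.
for p in (-1.0, 0.0, 0.275, 1.0):
    print(p, "->", scale_polarity(p))
```

With these bounds a polarity of 0.275 maps to about 6.74, which is the scaled value the notebook reports for the first review.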

**overall_sentiment** for a review is the average of the scaled review sentiment and the rating value.

Each medicine appears more than once in the dataset, so we take the mean of overall_sentiment over all reviews of a drug. This is the **mean_overall_sentiment** associated with each medicine. Similarly, **mean_useful_count** is the mean of usefulCount over all reviews of the drug.

The effectiveness is the product of these two, i.e. mean_overall_sentiment and mean_useful_count.

**effectiveness = mean_overall_sentiment * mean_useful_count**



Based on this measure we recommend medicines. 
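
To make the measure concrete, here is a minimal sketch that computes it on a made-up three-review table. The data values are invented for illustration, and the sketch uses pandas named aggregation rather than the merge-based steps used later in the notebook; the column names follow the notebook.

```python
import pandas as pd

# Toy data (made up for illustration; not taken from the real dataset).
toy = pd.DataFrame({
    "drugName": ["A", "A", "B"],
    "rating": [8.0, 6.0, 10.0],
    "scaled_sentiment_from_review": [6.0, 4.0, 8.0],
    "usefulCount": [10, 30, 5],
})

# Per-review overall sentiment: average of the scaled review sentiment and the rating.
toy["overall_sentiment"] = 0.5 * (toy["scaled_sentiment_from_review"] + toy["rating"])

# Per-drug means of the overall sentiment and of the useful count.
per_drug = toy.groupby("drugName").agg(
    mean_overall_sentiment=("overall_sentiment", "mean"),
    mean_useful_count=("usefulCount", "mean"),
)

# Effectiveness is the product of the two per-drug means.
per_drug["effectiveness"] = (
    per_drug["mean_overall_sentiment"] * per_drug["mean_useful_count"]
)
print(per_drug)
```

In this toy case drug A scores 120 and drug B scores 45: A outranks B because its reviews were marked useful more often, even though B has the higher mean sentiment.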


---
A few questions need to be addressed before going into EDA:

**Why do we need to find Condition based on Reviews (since we already know the condition before taking drug and giving Review)?**

ANS: The label ('condition') data may be unavailable.

**Is it possible to classify the condition based on the Reviews?** 

ANS: Word Clouds

****


Note that this notebook covers only the medicine recommendation system based on sentiment analysis!

## 1. Importing Libraries

In [2]:
# Importing basic libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

from sklearn.preprocessing import MinMaxScaler

# Libraries needed for text preprocessing and sentiment analysis
import string
import nltk
from nltk import pos_tag
from nltk.corpus import stopwords
from sklearn.feature_extraction.text import ENGLISH_STOP_WORDS
from nltk.tokenize import WhitespaceTokenizer
from nltk.stem import WordNetLemmatizer
from nltk.corpus import wordnet
from wordcloud import WordCloud

import warnings
warnings.filterwarnings("ignore")

In [None]:
# Libraries for sentiment analysis
import spacy
from spacytextblob.spacytextblob import SpacyTextBlob
from tqdm import tqdm
nlp = spacy.load('en_core_web_sm')
nlp.add_pipe('spacytextblob')


## 2. Importing Dataset 

In [None]:
drug_review_data = pd.read_csv("drugsCom_raw.tsv", sep='\t')
drug_review_data

Unnamed: 0.1,Unnamed: 0,drugName,condition,review,rating,date,usefulCount
0,206461,Valsartan,Left Ventricular Dysfunction,"""It has no side effect, I take it in combinati...",9.0,"May 20, 2012",27
1,95260,Guanfacine,ADHD,"""My son is halfway through his fourth week of ...",8.0,"April 27, 2010",192
2,92703,Lybrel,Birth Control,"""I used to take another oral contraceptive, wh...",5.0,"December 14, 2009",17
3,138000,Ortho Evra,Birth Control,"""This is my first time using any form of birth...",8.0,"November 3, 2015",10
4,35696,Buprenorphine / naloxone,Opiate Dependence,"""Suboxone has completely turned my life around...",9.0,"November 27, 2016",37
...,...,...,...,...,...,...,...
161292,191035,Campral,Alcohol Dependence,"""I wrote my first report in Mid-October of 201...",10.0,"May 31, 2015",125
161293,127085,Metoclopramide,Nausea/Vomiting,"""I was given this in IV before surgey. I immed...",1.0,"November 1, 2011",34
161294,187382,Orencia,Rheumatoid Arthritis,"""Limited improvement after 4 months, developed...",2.0,"March 15, 2014",35
161295,47128,Thyroid desiccated,Underactive Thyroid,"""I&#039;ve been on thyroid medication 49 years...",10.0,"September 19, 2015",79


#### 2.1 Extracting data with specific conditions (Depression, High Blood Pressure, Diabetes, Type 2)

In [None]:
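# Keep only the three conditions of interest; .copy() gives an independent frame so that
# adding columns later does not operate on a slice of drug_review_data
conditions = ['Depression', 'High Blood Pressure', 'Diabetes, Type 2']
data_specific = drug_review_data[drug_review_data['condition'].isin(conditions)].copy()
data_specific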

Unnamed: 0.1,Unnamed: 0,drugName,condition,review,rating,date,usefulCount
11,75612,L-methylfolate,Depression,"""I have taken anti-depressants for years, with...",10.0,"March 9, 2017",54
31,96233,Sertraline,Depression,"""1 week on Zoloft for anxiety and mood swings....",8.0,"May 7, 2011",3
44,121333,Venlafaxine,Depression,"""my gp started me on Venlafaxine yesterday to ...",4.0,"April 27, 2016",3
50,156544,Dulaglutide,"Diabetes, Type 2","""Hey Guys, It&#039;s been 4 months since my l...",10.0,"October 24, 2017",24
67,131909,Effexor XR,Depression,"""This medicine saved my life. I was at my wits...",10.0,"June 20, 2013",166
...,...,...,...,...,...,...,...
161251,198130,Metoprolol,High Blood Pressure,"""It is fourth blood pressure pill for me. It f...",4.0,"July 15, 2009",39
161258,34443,Bystolic,High Blood Pressure,"""While on Bystolic my feet and arms were numb....",1.0,"July 18, 2009",25
161278,86533,Invokana,"Diabetes, Type 2","""I just got diagnosed with type 2. My doctor p...",9.0,"October 23, 2015",47
161286,93069,Vortioxetine,Depression,"""This is the third med I&#039;ve tried for anx...",2.0,"July 17, 2016",33


In [None]:
data_specific['date'] = pd.to_datetime(data_specific['date'])

In [None]:
data_specific['no_of_days'] = (data_specific['date'].max() - data_specific['date']).dt.days

In [None]:
# New feature: no_of_days = number of days between each review and the most recent review in the data
data_specific

Unnamed: 0.1,Unnamed: 0,drugName,condition,review,rating,date,usefulCount,no_of_days
11,75612,L-methylfolate,Depression,"""I have taken anti-depressants for years, with...",10.0,2017-03-09,54,278
31,96233,Sertraline,Depression,"""1 week on Zoloft for anxiety and mood swings....",8.0,2011-05-07,3,2411
44,121333,Venlafaxine,Depression,"""my gp started me on Venlafaxine yesterday to ...",4.0,2016-04-27,3,594
50,156544,Dulaglutide,"Diabetes, Type 2","""Hey Guys, It&#039;s been 4 months since my l...",10.0,2017-10-24,24,49
67,131909,Effexor XR,Depression,"""This medicine saved my life. I was at my wits...",10.0,2013-06-20,166,1636
...,...,...,...,...,...,...,...,...
161251,198130,Metoprolol,High Blood Pressure,"""It is fourth blood pressure pill for me. It f...",4.0,2009-07-15,39,3072
161258,34443,Bystolic,High Blood Pressure,"""While on Bystolic my feet and arms were numb....",1.0,2009-07-18,25,3069
161278,86533,Invokana,"Diabetes, Type 2","""I just got diagnosed with type 2. My doctor p...",9.0,2015-10-23,47,781
161286,93069,Vortioxetine,Depression,"""This is the third med I&#039;ve tried for anx...",2.0,2016-07-17,33,513


#### 2.2 Final dataframe after removing the unneeded columns (Unnamed: 0 and date)

In [None]:
columns = ['Unnamed: 0', 'date']
df0 = data_specific.drop(columns=columns).reset_index(drop=True)
df0

Unnamed: 0,drugName,condition,review,rating,usefulCount,no_of_days
0,L-methylfolate,Depression,"""I have taken anti-depressants for years, with...",10.0,54,278
1,Sertraline,Depression,"""1 week on Zoloft for anxiety and mood swings....",8.0,3,2411
2,Venlafaxine,Depression,"""my gp started me on Venlafaxine yesterday to ...",4.0,3,594
3,Dulaglutide,"Diabetes, Type 2","""Hey Guys, It&#039;s been 4 months since my l...",10.0,24,49
4,Effexor XR,Depression,"""This medicine saved my life. I was at my wits...",10.0,166,1636
...,...,...,...,...,...,...
13939,Metoprolol,High Blood Pressure,"""It is fourth blood pressure pill for me. It f...",4.0,39,3072
13940,Bystolic,High Blood Pressure,"""While on Bystolic my feet and arms were numb....",1.0,25,3069
13941,Invokana,"Diabetes, Type 2","""I just got diagnosed with type 2. My doctor p...",9.0,47,781
13942,Vortioxetine,Depression,"""This is the third med I&#039;ve tried for anx...",2.0,33,513


In [None]:
# We might expect usefulCount to grow with the number of days since the review was posted,
# but this needs statistical justification.
# Check the correlation between no_of_days and usefulCount
# (numeric_only=True skips the text columns; recent pandas versions raise an error without it)
df0.corr(numeric_only=True)

# The correlation between no_of_days and usefulCount is weak (about 0.22), so we can drop this column.

Unnamed: 0,rating,usefulCount,no_of_days
rating,1.0,0.243938,0.166209
usefulCount,0.243938,1.0,0.217419
no_of_days,0.166209,0.217419,1.0


In [None]:
df1 = df0.drop(columns='no_of_days')
df1

Unnamed: 0,drugName,condition,review,rating,usefulCount
0,L-methylfolate,Depression,"""I have taken anti-depressants for years, with...",10.0,54
1,Sertraline,Depression,"""1 week on Zoloft for anxiety and mood swings....",8.0,3
2,Venlafaxine,Depression,"""my gp started me on Venlafaxine yesterday to ...",4.0,3
3,Dulaglutide,"Diabetes, Type 2","""Hey Guys, It&#039;s been 4 months since my l...",10.0,24
4,Effexor XR,Depression,"""This medicine saved my life. I was at my wits...",10.0,166
...,...,...,...,...,...
13939,Metoprolol,High Blood Pressure,"""It is fourth blood pressure pill for me. It f...",4.0,39
13940,Bystolic,High Blood Pressure,"""While on Bystolic my feet and arms were numb....",1.0,25
13941,Invokana,"Diabetes, Type 2","""I just got diagnosed with type 2. My doctor p...",9.0,47
13942,Vortioxetine,Depression,"""This is the third med I&#039;ve tried for anx...",2.0,33


In [None]:
df2 = pd.merge(df1, df1.groupby(['drugName']).size().reset_index(name='drug_count'), on = 'drugName', how = 'left')
df2
# drug_count = number of reviews for each drug in the dataset
# This feature could be used to build a confidence interval (e.g. 95%) for the average rating of a specific drug

Unnamed: 0,drugName,condition,review,rating,usefulCount,drug_count
0,L-methylfolate,Depression,"""I have taken anti-depressants for years, with...",10.0,54,66
1,Sertraline,Depression,"""1 week on Zoloft for anxiety and mood swings....",8.0,3,459
2,Venlafaxine,Depression,"""my gp started me on Venlafaxine yesterday to ...",4.0,3,437
3,Dulaglutide,"Diabetes, Type 2","""Hey Guys, It&#039;s been 4 months since my l...",10.0,24,231
4,Effexor XR,Depression,"""This medicine saved my life. I was at my wits...",10.0,166,133
...,...,...,...,...,...,...
13939,Metoprolol,High Blood Pressure,"""It is fourth blood pressure pill for me. It f...",4.0,39,92
13940,Bystolic,High Blood Pressure,"""While on Bystolic my feet and arms were numb....",1.0,25,99
13941,Invokana,"Diabetes, Type 2","""I just got diagnosed with type 2. My doctor p...",9.0,47,143
13942,Vortioxetine,Depression,"""This is the third med I&#039;ve tried for anx...",2.0,33,345
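
The per-drug count above is built with a group-by followed by a left merge. As a hedged alternative, the sketch below shows that `groupby(...).transform('size')` yields the same per-row counts without the merge. The toy frame is made up for illustration.

```python
import pandas as pd

# Made-up drug names for illustration only.
toy = pd.DataFrame({"drugName": ["A", "B", "A", "C", "A"]})

# Approach used in the cell above: count per group, then left-merge back.
counts = toy.groupby("drugName").size().reset_index(name="drug_count")
merged = pd.merge(toy, counts, on="drugName", how="left")

# Alternative: broadcast each group's size directly onto its rows.
toy["drug_count"] = toy.groupby("drugName")["drugName"].transform("size")

print(merged["drug_count"].tolist())
print(toy["drug_count"].tolist())
```

Both approaches give the same column; the transform version just avoids building and joining an intermediate table.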


In [None]:
# Drugs with this few reviews are not useful for determining confidence intervals
df2[df2['drug_count']<=20]

Unnamed: 0,drugName,condition,review,rating,usefulCount,drug_count
36,Prazosin,High Blood Pressure,"""Using this for parasomnia, bph and bp. Kind o...",6.0,2,6
63,Rexulti,Depression,"""I was prescribed Rexulti when my daily depres...",10.0,45,16
96,Emsam,Depression,"""Tried Effexor, Wellbutrin, Paxil, Zoloft in t...",8.0,51,15
103,Toujeo,"Diabetes, Type 2","""Getting better results than with Lantus. Use...",9.0,8,9
109,Empagliflozin / linagliptin,"Diabetes, Type 2","""I am a type two that went into Diabetic ketoa...",10.0,10,7
...,...,...,...,...,...,...
13918,Twynsta,High Blood Pressure,"""Awesome medicine. Have been on it for 3 years...",10.0,13,6
13928,Aldomet,High Blood Pressure,"""I&#039;m sure this medicine has its place in ...",3.0,24,1
13935,Rexulti,Depression,"""in my third week of 05 mg as add on to Paxil,...",8.0,41,16
13936,Seroquel,Depression,"""I have been on Seroquel for several years and...",10.0,36,19
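
The notebook does not go on to compute these confidence intervals. As a sketch of why small review counts are a problem, the hypothetical helper `mean_rating_ci` below uses the normal approximation (mean plus or minus 1.96 standard errors), which is an assumption here and is only reasonable for fairly large counts. The ratings are made up.

```python
import math
import statistics


def mean_rating_ci(ratings, z=1.96):
    """Return (mean, lower, upper) for an approximate 95% interval of the mean rating.

    Uses the normal approximation mean +/- z * s / sqrt(n); hypothetical helper.
    """
    n = len(ratings)
    if n < 2:
        raise ValueError("need at least two ratings to estimate the spread")
    mean = statistics.mean(ratings)
    std_err = statistics.stdev(ratings) / math.sqrt(n)
    return mean, mean - z * std_err, mean + z * std_err


# Made-up ratings for illustration only.
few = [10, 3, 8, 9]
many = few * 25  # the same values repeated, giving 100 reviews

print(mean_rating_ci(few))
print(mean_rating_ci(many))
```

Both samples have the same mean rating, but the interval for the four-review drug is several times wider, so its average says much less about the drug.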


In [None]:
# condition_count = number of reviews for each condition in the dataset
df = pd.merge(df2, df2.groupby(['condition']).size().reset_index(name='condition_count'), on='condition', how='left')
df

Unnamed: 0,drugName,condition,review,rating,usefulCount,drug_count,condition_count
0,L-methylfolate,Depression,"""I have taken anti-depressants for years, with...",10.0,54,66,9069
1,Sertraline,Depression,"""1 week on Zoloft for anxiety and mood swings....",8.0,3,459,9069
2,Venlafaxine,Depression,"""my gp started me on Venlafaxine yesterday to ...",4.0,3,437,9069
3,Dulaglutide,"Diabetes, Type 2","""Hey Guys, It&#039;s been 4 months since my l...",10.0,24,231,2554
4,Effexor XR,Depression,"""This medicine saved my life. I was at my wits...",10.0,166,133,9069
...,...,...,...,...,...,...,...
13939,Metoprolol,High Blood Pressure,"""It is fourth blood pressure pill for me. It f...",4.0,39,92,2321
13940,Bystolic,High Blood Pressure,"""While on Bystolic my feet and arms were numb....",1.0,25,99,2321
13941,Invokana,"Diabetes, Type 2","""I just got diagnosed with type 2. My doctor p...",9.0,47,143,2554
13942,Vortioxetine,Depression,"""This is the third med I&#039;ve tried for anx...",2.0,33,345,9069


**Sentiment associated with each review, using the spaCy TextBlob pipeline**

References:

https://spacy.io/universe/project/spacy-textblob

https://www.kaggle.com/code/bhuemims/recommendation-medicines-by-using-a-review#2.-Date-Preprocessing


In [None]:
# reviews = df['review']

# Predict_Sentiment = []
# for review in tqdm(reviews):
#     doc = nlp(review)
#     Predict_Sentiment += [doc._.blob.polarity]
# df["Predict_Sentiment"] = Predict_Sentiment
# df.head()

100%|██████████| 13944/13944 [16:14<00:00, 14.31it/s] 


Unnamed: 0,drugName,condition,review,rating,usefulCount,drug_count,condition_count,Predict_Sentiment
0,L-methylfolate,Depression,"""I have taken anti-depressants for years, with...",10.0,54,66,9069,0.275
1,Sertraline,Depression,"""1 week on Zoloft for anxiety and mood swings....",8.0,3,459,9069,0.166667
2,Venlafaxine,Depression,"""my gp started me on Venlafaxine yesterday to ...",4.0,3,437,9069,-0.136508
3,Dulaglutide,"Diabetes, Type 2","""Hey Guys, It&#039;s been 4 months since my l...",10.0,24,231,2554,0.103571
4,Effexor XR,Depression,"""This medicine saved my life. I was at my wits...",10.0,166,133,9069,0.168194


In [None]:
# # Saving the sentiments file
# df.to_csv("drugReviews_predicted_sentiments_updated.csv")

In [3]:
df = pd.read_csv("drugReviews_predicted_sentiments_updated.csv")
df

Unnamed: 0.1,Unnamed: 0,drugName,condition,review,rating,usefulCount,drug_count,condition_count,Predict_Sentiment
0,0,L-methylfolate,Depression,"""I have taken anti-depressants for years, with...",10.0,54,66,9069,0.275000
1,1,Sertraline,Depression,"""1 week on Zoloft for anxiety and mood swings....",8.0,3,459,9069,0.166667
2,2,Venlafaxine,Depression,"""my gp started me on Venlafaxine yesterday to ...",4.0,3,437,9069,-0.136508
3,3,Dulaglutide,"Diabetes, Type 2","""Hey Guys, It&#039;s been 4 months since my l...",10.0,24,231,2554,0.103571
4,4,Effexor XR,Depression,"""This medicine saved my life. I was at my wits...",10.0,166,133,9069,0.168194
...,...,...,...,...,...,...,...,...,...
13939,13939,Metoprolol,High Blood Pressure,"""It is fourth blood pressure pill for me. It f...",4.0,39,92,2321,-0.083333
13940,13940,Bystolic,High Blood Pressure,"""While on Bystolic my feet and arms were numb....",1.0,25,99,2321,-0.157937
13941,13941,Invokana,"Diabetes, Type 2","""I just got diagnosed with type 2. My doctor p...",9.0,47,143,2554,0.048611
13942,13942,Vortioxetine,Depression,"""This is the third med I&#039;ve tried for anx...",2.0,33,345,9069,-0.100694


In [4]:
df['scaled_sentiment_from_review'] = MinMaxScaler(feature_range=(1,10)).fit_transform(df[['Predict_Sentiment']])
df = df.drop(columns='Unnamed: 0')

In [5]:
df['overall_sentiment'] = 0.5*(df['scaled_sentiment_from_review'].values + df['rating'].values)
df

Unnamed: 0,drugName,condition,review,rating,usefulCount,drug_count,condition_count,Predict_Sentiment,scaled_sentiment_from_review,overall_sentiment
0,L-methylfolate,Depression,"""I have taken anti-depressants for years, with...",10.0,54,66,9069,0.275000,6.737500,8.368750
1,Sertraline,Depression,"""1 week on Zoloft for anxiety and mood swings....",8.0,3,459,9069,0.166667,6.250000,7.125000
2,Venlafaxine,Depression,"""my gp started me on Venlafaxine yesterday to ...",4.0,3,437,9069,-0.136508,4.885714,4.442857
3,Dulaglutide,"Diabetes, Type 2","""Hey Guys, It&#039;s been 4 months since my l...",10.0,24,231,2554,0.103571,5.966071,7.983036
4,Effexor XR,Depression,"""This medicine saved my life. I was at my wits...",10.0,166,133,9069,0.168194,6.256875,8.128438
...,...,...,...,...,...,...,...,...,...,...
13939,Metoprolol,High Blood Pressure,"""It is fourth blood pressure pill for me. It f...",4.0,39,92,2321,-0.083333,5.125000,4.562500
13940,Bystolic,High Blood Pressure,"""While on Bystolic my feet and arms were numb....",1.0,25,99,2321,-0.157937,4.789286,2.894643
13941,Invokana,"Diabetes, Type 2","""I just got diagnosed with type 2. My doctor p...",9.0,47,143,2554,0.048611,5.718750,7.359375
13942,Vortioxetine,Depression,"""This is the third med I&#039;ve tried for anx...",2.0,33,345,9069,-0.100694,5.046875,3.523438


In [6]:
df = pd.merge(df, df.groupby(['drugName'])['usefulCount'].mean().reset_index(name='mean_useful_count'), on='drugName', how='left')
df = pd.merge(df, df.groupby(['drugName'])['overall_sentiment'].mean().reset_index(name='mean_overall_sentiment'), on='drugName', how='left')
df.head()

Unnamed: 0,drugName,condition,review,rating,usefulCount,drug_count,condition_count,Predict_Sentiment,scaled_sentiment_from_review,overall_sentiment,mean_useful_count,mean_overall_sentiment
0,L-methylfolate,Depression,"""I have taken anti-depressants for years, with...",10.0,54,66,9069,0.275,6.7375,8.36875,71.075758,7.044704
1,Sertraline,Depression,"""1 week on Zoloft for anxiety and mood swings....",8.0,3,459,9069,0.166667,6.25,7.125,51.407407,6.489099
2,Venlafaxine,Depression,"""my gp started me on Venlafaxine yesterday to ...",4.0,3,437,9069,-0.136508,4.885714,4.442857,31.727689,6.102988
3,Dulaglutide,"Diabetes, Type 2","""Hey Guys, It&#039;s been 4 months since my l...",10.0,24,231,2554,0.103571,5.966071,7.983036,18.333333,5.739026
4,Effexor XR,Depression,"""This medicine saved my life. I was at my wits...",10.0,166,133,9069,0.168194,6.256875,8.128438,38.428571,6.67082


In [7]:
df.columns

Index(['drugName', 'condition', 'review', 'rating', 'usefulCount',
       'drug_count', 'condition_count', 'Predict_Sentiment',
       'scaled_sentiment_from_review', 'overall_sentiment',
       'mean_useful_count', 'mean_overall_sentiment'],
      dtype='object')

In [8]:
df_groups = df.groupby(['condition','drugName']).agg({'mean_overall_sentiment':['mean'], 'mean_useful_count':['mean']}).reset_index()
df_groups

Unnamed: 0_level_0,condition,drugName,mean_overall_sentiment,mean_useful_count
Unnamed: 0_level_1,Unnamed: 1_level_1,Unnamed: 2_level_1,mean,mean
0,Depression,Abilify,6.252332,68.690141
1,Depression,Alprazolam,7.711308,75.868852
2,Depression,Amitriptyline,7.299476,63.342105
3,Depression,Amitriptyline / chlordiazepoxide,8.000000,27.666667
4,Depression,Amoxapine,7.978125,2.500000
...,...,...,...,...
329,High Blood Pressure,Verapamil,5.669855,20.500000
330,High Blood Pressure,Verelan PM,8.195000,66.000000
331,High Blood Pressure,Zestoretic,7.900852,46.666667
332,High Blood Pressure,Zestril,5.733833,68.714286


In [9]:
df_groups['effectiveness'] = df_groups['mean_overall_sentiment'].values*df_groups['mean_useful_count'].values
df_groups

Unnamed: 0_level_0,condition,drugName,mean_overall_sentiment,mean_useful_count,effectiveness
Unnamed: 0_level_1,Unnamed: 1_level_1,Unnamed: 2_level_1,mean,mean,Unnamed: 5_level_1
0,Depression,Abilify,6.252332,68.690141,429.473564
1,Depression,Alprazolam,7.711308,75.868852,585.048080
2,Depression,Amitriptyline,7.299476,63.342105,462.364171
3,Depression,Amitriptyline / chlordiazepoxide,8.000000,27.666667,221.333333
4,Depression,Amoxapine,7.978125,2.500000,19.945312
...,...,...,...,...,...
329,High Blood Pressure,Verapamil,5.669855,20.500000,116.232026
330,High Blood Pressure,Verelan PM,8.195000,66.000000,540.870000
331,High Blood Pressure,Zestoretic,7.900852,46.666667,368.706439
332,High Blood Pressure,Zestril,5.733833,68.714286,393.996233


In [16]:
# Top 5 Medicines for Depression
df_groups[df_groups['condition']=='Depression'].sort_values(by = 'effectiveness', ignore_index = True, ascending = False).head()

Unnamed: 0_level_0,condition,drugName,mean_overall_sentiment,mean_useful_count,effectiveness
Unnamed: 0_level_1,Unnamed: 1_level_1,Unnamed: 2_level_1,mean,mean,Unnamed: 5_level_1
0,Depression,Methylin ER,8.133281,181.0,1472.123906
1,Depression,Provigil,7.70723,117.0,901.745923
2,Depression,Desyrel,6.85059,128.0,876.875556
3,Depression,Elavil,7.242677,99.666667,721.853425
4,Depression,Norpramin,8.007386,89.0,712.657386


In [20]:
# Top 5 Medicines for Diabetes, Type 2
df_groups[df_groups['condition']=='Diabetes, Type 2'].sort_values(by = 'effectiveness', ignore_index = True, ascending = False).head()

Unnamed: 0_level_0,condition,drugName,mean_overall_sentiment,mean_useful_count,effectiveness
Unnamed: 0_level_1,Unnamed: 1_level_1,Unnamed: 2_level_1,mean,mean,Unnamed: 5_level_1
0,"Diabetes, Type 2",Glucophage XR,8.086538,184.0,1487.923077
1,"Diabetes, Type 2",Chromium picolinate,7.748611,116.666667,904.00463
2,"Diabetes, Type 2",GlipiZIDE XL,7.337386,92.5,678.708239
3,"Diabetes, Type 2",Glucotrol,6.85,93.428571,639.985714
4,"Diabetes, Type 2",Amaryl,7.142279,75.3,537.81364


In [21]:
# Top 5 Medicines for High Blood Pressure
df_groups[df_groups['condition']=='High Blood Pressure'].sort_values(by = 'effectiveness', ignore_index = True, ascending = False).head()

Unnamed: 0_level_0,condition,drugName,mean_overall_sentiment,mean_useful_count,effectiveness
Unnamed: 0_level_1,Unnamed: 1_level_1,Unnamed: 2_level_1,mean,mean,Unnamed: 5_level_1
0,High Blood Pressure,Altace,6.194148,112.25,695.293082
1,High Blood Pressure,Aldactone,5.244934,129.333333,678.344747
2,High Blood Pressure,Cozaar,5.915381,107.939394,638.502636
3,High Blood Pressure,Norvasc,5.121702,110.921053,568.104624
4,High Blood Pressure,Verelan PM,8.195,66.0,540.87
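
The three top-5 cells above repeat the same filter and sort. A possible way to get every condition's top drugs in one call is sketched below. The helper `top_n_per_condition` is hypothetical, the scores are made up, and the sketch assumes flat column names, whereas `df_groups` in this notebook has two-level headers that would need flattening first.

```python
import pandas as pd

# Made-up scores for illustration; column names mirror df_groups but are flat.
scores = pd.DataFrame({
    "condition": ["X", "X", "X", "Y", "Y"],
    "drugName": ["a", "b", "c", "d", "e"],
    "effectiveness": [10.0, 30.0, 20.0, 5.0, 50.0],
})


def top_n_per_condition(frame, n=5):
    """Return the n highest-effectiveness rows within each condition."""
    # Sort once by score, then keep the first n rows of every condition.
    ranked = frame.sort_values("effectiveness", ascending=False)
    top = ranked.groupby("condition").head(n)
    # Order the result by condition, best drug first within each condition.
    return top.sort_values(
        ["condition", "effectiveness"], ascending=[True, False], ignore_index=True
    )


print(top_n_per_condition(scores, n=2))
```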


END.