**Medicine recommendation system**

**1. Install the required libraries**



In [None]:
!pip install pandas numpy scikit-learn flask flask_sqlalchemy flask_ngrok




**2. Import the libraries**

In [None]:
import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.neighbors import NearestNeighbors
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline

**3. Load and preprocess the data**


In [None]:
medicines_df = pd.read_csv('/content/medicines.csv')
medicines_df.dropna(inplace=True)
medicines_df.head()


Unnamed: 0,Medicine Name,Composition,Uses,Side_effects,Image URL,Manufacturer,Excellent Review %,Average Review %,Poor Review %
0,Avastin 400mg Injection,Bevacizumab (400mg),Cancer of colon and rectum Non-small cell lun...,Rectal bleeding Taste change Headache Noseblee...,"https://onemg.gumlet.io/l_watermark_346,w_480,...",Roche Products India Pvt Ltd,22,56,22
1,Augmentin 625 Duo Tablet,Amoxycillin (500mg) + Clavulanic Acid (125mg),Treatment of Bacterial infections,Vomiting Nausea Diarrhea Mucocutaneous candidi...,"https://onemg.gumlet.io/l_watermark_346,w_480,...",Glaxo SmithKline Pharmaceuticals Ltd,47,35,18
2,Azithral 500 Tablet,Azithromycin (500mg),Treatment of Bacterial infections,Nausea Abdominal pain Diarrhea,"https://onemg.gumlet.io/l_watermark_346,w_480,...",Alembic Pharmaceuticals Ltd,39,40,21
3,Ascoril LS Syrup,Ambroxol (30mg/5ml) + Levosalbutamol (1mg/5ml)...,Treatment of Cough with mucus,Nausea Vomiting Diarrhea Upset stomach Stomach...,"https://onemg.gumlet.io/l_watermark_346,w_480,...",Glenmark Pharmaceuticals Ltd,24,41,35
4,Aciloc 150 Tablet,Ranitidine (150mg),Treatment of Gastroesophageal reflux disease (...,Headache Diarrhea Gastrointestinal disturbance,"https://onemg.gumlet.io/l_watermark_346,w_480,...",Cadila Pharmaceuticals Ltd,34,37,29


In [None]:
medicines_df.shape

(11825, 9)

In [None]:
medicines_df.info() # no null values

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 11825 entries, 0 to 11824
Data columns (total 9 columns):
 #   Column              Non-Null Count  Dtype 
---  ------              --------------  ----- 
 0   Medicine Name       11825 non-null  object
 1   Composition         11825 non-null  object
 2   Uses                11825 non-null  object
 3   Side_effects        11825 non-null  object
 4   Image URL           11825 non-null  object
 5   Manufacturer        11825 non-null  object
 6   Excellent Review %  11825 non-null  int64 
 7   Average Review %    11825 non-null  int64 
 8   Poor Review %       11825 non-null  int64 
dtypes: int64(3), object(6)
memory usage: 831.6+ KB


**Drop rows with missing values**

In [None]:
medicines_df.dropna(subset=['Medicine Name', 'Uses', 'Side_effects', 'Manufacturer', 'Excellent Review %', 'Average Review %', 'Poor Review %'], inplace=True)
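As a small illustration of what `subset` changes, the sketch below uses a hypothetical three-row frame (`toy` and `kept` are illustration-only names). Only NaN values in the listed columns cause a row to be dropped. In this notebook the earlier unrestricted `dropna` call has already removed every row containing a missing value, so the call above should leave the frame unchanged.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame, not the real medicines data
toy = pd.DataFrame({
    "Medicine Name": ["A", "B", "C"],
    "Uses": ["x", np.nan, "z"],
    "Image URL": [np.nan, "u", "v"],
})

# Only NaN values in the listed columns trigger a drop;
# row "A" survives even though its Image URL is missing.
kept = toy.dropna(subset=["Medicine Name", "Uses"])
print(kept["Medicine Name"].tolist())
```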

**4. Prepare the transformation pipeline and the KNN model**

In [None]:
categorical_columns = ['Uses', 'Side_effects', 'Manufacturer']
numerical_columns = ['Excellent Review %', 'Average Review %', 'Poor Review %']
all_columns = ['Medicine Name'] + categorical_columns + numerical_columns
medicines_df = medicines_df[all_columns]

In [None]:
preprocessor = ColumnTransformer(
    transformers=[
        ('cat', OneHotEncoder(handle_unknown='ignore'), categorical_columns),
        ('num', StandardScaler(), numerical_columns)
    ])
preprocessor.fit(medicines_df[categorical_columns + numerical_columns])

In [None]:
X = preprocessor.transform(medicines_df[categorical_columns + numerical_columns])
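To make the shape of `X` easier to reason about, here is a minimal sketch of the same kind of `ColumnTransformer` on a hypothetical three-row frame (`toy_df`, `toy_preprocessor` and `toy_X` are illustration-only names). Each distinct category value becomes its own 0/1 column and each numeric column is standardized, so the output width is the number of distinct category values plus the number of numeric columns.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data with one categorical and one numeric column
toy_df = pd.DataFrame({
    "Manufacturer": ["A", "B", "A"],
    "Excellent Review %": [10, 20, 30],
})

toy_preprocessor = ColumnTransformer(
    transformers=[
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["Manufacturer"]),
        ("num", StandardScaler(), ["Excellent Review %"]),
    ]
)
toy_X = toy_preprocessor.fit_transform(toy_df)

# Two one-hot columns (A, B) plus one scaled numeric column
print(toy_X.shape)
```

On the real data, `Uses` and `Side_effects` are free-text strings, so every distinct string is treated as a separate category and the encoded matrix can be very wide.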

**Create and fit the KNN model**

In [None]:
knn = NearestNeighbors(n_neighbors=5, metric='euclidean')
knn.fit(X)
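The following sketch shows what `kneighbors` returns, using four hypothetical 2-D points instead of the encoded medicines matrix (`points` and `toy_knn` are illustration-only names). It returns two arrays, distances and row positions, each with one row per query and sorted from nearest to farthest.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Hypothetical points standing in for the rows of X
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0]])

toy_knn = NearestNeighbors(n_neighbors=2, metric="euclidean")
toy_knn.fit(points)

# One query point; results are row positions in `points`
distances, indices = toy_knn.kneighbors(np.array([[0.0, 0.1]]))
print(indices[0])    # positions of the two nearest rows
print(distances[0])  # their Euclidean distances, ascending
```

Because the returned values are positions in the fitted matrix, they should be used with positional indexing (`iloc`) on a frame whose row order matches that matrix.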

**5. Recommendation function**

In [None]:
def recommend_medicines(input_data):
    # Convert the input values to a one-row DataFrame
    input_df = pd.DataFrame([input_data], columns=categorical_columns + numerical_columns)

    # Apply the fitted preprocessor to the input
    input_data_processed = preprocessor.transform(input_df)

    # Use the KNN model to find the closest medicines
    distances, indices = knn.kneighbors(input_data_processed)

    # Use the positional indices to look up the medicines in the original DataFrame
    nearest_medicines = medicines_df.iloc[indices[0]]
    return nearest_medicines
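One behaviour worth checking is how the one-hot encoder handles input strings it never saw during fitting. The sketch below is an illustration on two hypothetical training values (`train`, `encoder`, `seen` and `unseen` are illustration-only names): with `handle_unknown='ignore'`, an unseen value is encoded as all zeros rather than raising an error.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Hypothetical training values; the second mimics the leading space seen in the data
train = pd.DataFrame({"Uses": ["Treatment of Bacterial infections", " Breast cancer"]})

encoder = OneHotEncoder(handle_unknown="ignore")
encoder.fit(train)

# An exact match activates one column; anything else encodes as all zeros
seen = encoder.transform(pd.DataFrame({"Uses": [" Breast cancer"]})).toarray()
unseen = encoder.transform(pd.DataFrame({"Uses": ["Cancer"]})).toarray()
print(seen)
print(unseen)
```

For `recommend_medicines` this suggests that a short keyword such as 'Cancer' only contributes to the distance if it exactly equals a full `Uses` string from the dataset. Otherwise the match is presumably driven by the manufacturer and the three review percentages.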

**6. Test the recommendation function**

In [None]:
# Example use of the recommendation function
input_data = ['Cancer', 'nausea', 'Roche Products India Pvt Ltd', 22, 56, 22]
recommended_medicines = recommend_medicines(input_data)

# Print the recommendations
print("Recommended medicines:")
for idx, row in recommended_medicines.iterrows():
    print(f"Medicine name: {row['Medicine Name']}")
    print(f"Uses: {row['Uses']}")
    print(f"Side effects: {row['Side_effects']}")
    print(f"Manufacturer: {row['Manufacturer']}")
    print(f"Excellent review: {row['Excellent Review %']}%")
    print(f"Average review: {row['Average Review %']}%")
    print(f"Poor review: {row['Poor Review %']}%")
    print("-" * 30)


Recommended medicines:
Medicine name: Avastin 400mg Injection
Uses:  Cancer of colon and rectum Non-small cell lung cancer Kidney cancer Brain tumor Ovarian cancer Cervical cancer
Side effects: Rectal bleeding Taste change Headache Nosebleeds Back pain Dry skin High blood pressure Protein in urine Inflammation of the nose
Manufacturer: Roche Products India Pvt Ltd
Excellent review: 22%
Average review: 56%
Poor review: 22%
------------------------------
Medicine name: Perjeta 420mg Injection
Uses:  Breast cancer
Side effects: Diarrhea Hair loss Feeling sick Rash Stomach inflammation Decreased blood cells red cells white cells and platelets Muscle pain Cough Heartburn Breathlessness Dizziness Fatigue
Manufacturer: Roche Products India Pvt Ltd
Excellent review: 0%
Average review: 56%
Poor review: 44%
------------------------------
Medicine name: Aztolet  20 Tablet
Uses:  Heart attack prevention and high c