# Data Science:: Healthcare - Persistency of a drug:: Group Project

## by ..., .... - April ....., 2021 

Source: https://en.wikipedia.org/wiki/Nontuberculous_mycobacteria

Nontuberculous mycobacteria (NTM), also known as environmental mycobacteria, atypical mycobacteria and mycobacteria other than tuberculosis (MOTT), are mycobacteria which do not cause tuberculosis or leprosy (also known as Hansen's disease). NTM do cause pulmonary diseases that resemble tuberculosis. Mycobacteriosis is any of these illnesses, usually meant to exclude tuberculosis. They occur in many animals, including humans.

**Epidemiology**

NTM are widely distributed in the environment, particularly in wet soil, marshland, streams, rivers and estuaries. Different species of NTM prefer different types of environment. Human disease is believed to be acquired from environmental exposures. Unlike tuberculosis and leprosy, animal-to-human or human-to-human transmission of NTM rarely occurs.

NTM diseases have been seen in most industrialized countries, where incidence rates vary from 1.0 to 1.8 cases per 100,000 persons. Recent studies, including one done in Ontario, Canada, suggest that incidence is much higher. Pulmonary NTM is estimated by some experts in the field to be at least ten times more common than TB in the U.S., with at least 150,000 cases per year.

**Pathogenesis**

The most common clinical manifestation of NTM disease is lung disease, but lymphatic, skin/soft tissue, and disseminated diseases are also important.

Pulmonary disease caused by NTM is most often seen in postmenopausal women and patients with underlying lung disease such as cystic fibrosis, bronchiectasis, and prior tuberculosis. It is not uncommon for patients with alpha 1-antitrypsin deficiency, Marfan syndrome, or primary ciliary dyskinesia to have pulmonary NTM colonization and/or infection. Pulmonary NTM can also be found in individuals with AIDS and malignant disease. Many NTM species can cause it, and the dominant species varies by region, but the most frequent are Mycobacterium avium complex (MAC) and M. kansasii.

Clinical symptoms vary in scope and intensity, but commonly include chronic cough, often with purulent sputum. Hemoptysis may also be present. Systemic symptoms include malaise, fatigue, and weight loss in advanced disease. The diagnosis of M. abscessus pulmonary infection requires the presence of symptoms, radiologic abnormalities, and microbiologic cultures.

**Diagnosis**

Diagnosis of opportunistic mycobacterial disease is made by repeated isolation and identification of the pathogen, together with compatible clinical and radiological features. Like M. tuberculosis, most nontuberculous mycobacteria can be detected microscopically and grow on Löwenstein-Jensen medium. Many reference centres now identify the species with a nucleic acid-based method, such as detecting sequence differences in the gene coding for 16S ribosomal RNA.

Diagnosis of pulmonary NTM disease requires both identification of the mycobacterium in the patient's lung(s) and a high-resolution CT scan of the lungs.

In [2]:
# Loading necessary libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import datetime as dt
import sweetviz as sv
%matplotlib inline
print("Libraries loaded")

Libraries loaded


In [9]:
drug = pd.read_excel('Healthcare_dataset_1.xls')

In [10]:
drug.shape

(3424, 69)

In [11]:
drug.head()

,Ptid,Persistency_Flag,Gender,Race,Ethnicity,Region,Age_Bucket,Ntm_Speciality,Ntm_Specialist_Flag,Ntm_Speciality_Bucket,...,Risk_Family_History_Of_Osteoporosis,Risk_Low_Calcium_Intake,Risk_Vitamin_D_Insufficiency,Risk_Poor_Health_Frailty,Risk_Excessive_Thinness,Risk_Hysterectomy_Oophorectomy,Risk_Estrogen_Deficiency,Risk_Immobilization,Risk_Recurring_Falls,Count_Of_Risks
0,P1,Persistent,Male,Caucasian,Not Hispanic,West,>75,GENERAL PRACTITIONER,Others,OB/GYN/Others/PCP/Unknown,...,N,N,N,N,N,N,N,N,N,0
1,P2,Non-Persistent,Male,Asian,Not Hispanic,West,55-65,GENERAL PRACTITIONER,Others,OB/GYN/Others/PCP/Unknown,...,N,N,N,N,N,N,N,N,N,0
2,P3,Non-Persistent,Female,Other/Unknown,Hispanic,Midwest,65-75,GENERAL PRACTITIONER,Others,OB/GYN/Others/PCP/Unknown,...,N,Y,N,N,N,N,N,N,N,2
3,P4,Non-Persistent,Female,Caucasian,Not Hispanic,Midwest,>75,GENERAL PRACTITIONER,Others,OB/GYN/Others/PCP/Unknown,...,N,N,N,N,N,N,N,N,N,1
4,P5,Non-Persistent,Female,Caucasian,Not Hispanic,Midwest,>75,GENERAL PRACTITIONER,Others,OB/GYN/Others/PCP/Unknown,...,N,N,N,N,N,N,N,N,N,1
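
Since `Persistency_Flag` is the target of this project, a quick look at its class balance is a natural next step. The sketch below shows one way to do that with `value_counts(normalize=True)`. It runs on a small stand-in frame (`sample`, a hypothetical name) built only from the five rows printed above, so the shares it prints describe those five rows and not the full dataset.

```python
import pandas as pd

# Hypothetical stand-in for `drug`: only the five rows visible in the head() output.
sample = pd.DataFrame({
    "Ptid": ["P1", "P2", "P3", "P4", "P5"],
    "Persistency_Flag": [
        "Persistent",
        "Non-Persistent",
        "Non-Persistent",
        "Non-Persistent",
        "Non-Persistent",
    ],
})

# Share of each class in the target column, largest class first.
class_share = sample["Persistency_Flag"].value_counts(normalize=True)
print(class_share)
```

On the real data the same call would be `drug["Persistency_Flag"].value_counts(normalize=True)`.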


In [12]:
drug.columns

Index(['Ptid', 'Persistency_Flag', 'Gender', 'Race', 'Ethnicity', 'Region',
       'Age_Bucket', 'Ntm_Speciality', 'Ntm_Specialist_Flag',
       'Ntm_Speciality_Bucket', 'Gluco_Record_Prior_Ntm',
       'Gluco_Record_During_Rx', 'Dexa_Freq_During_Rx', 'Dexa_During_Rx',
       'Frag_Frac_Prior_Ntm', 'Frag_Frac_During_Rx', 'Risk_Segment_Prior_Ntm',
       'Tscore_Bucket_Prior_Ntm', 'Risk_Segment_During_Rx',
       'Tscore_Bucket_During_Rx', 'Change_T_Score', 'Change_Risk_Segment',
       'Adherent_Flag', 'Idn_Indicator', 'Injectable_Experience_During_Rx',
       'Comorb_Encounter_For_Screening_For_Malignant_Neoplasms',
       'Comorb_Encounter_For_Immunization',
       'Comorb_Encntr_For_General_Exam_W_O_Complaint,_Susp_Or_Reprtd_Dx',
       'Comorb_Vitamin_D_Deficiency',
       'Comorb_Other_Joint_Disorder_Not_Elsewhere_Classified',
       'Comorb_Encntr_For_Oth_Sp_Exam_W_O_Complaint_Suspected_Or_Reprtd_Dx',
       'Comorb_Long_Term_Current_Drug_Therapy', 'Comorb_Dorsalgia',
       'Com

In [13]:
my_report = sv.analyze(drug)
my_report.show_html()

Report SWEETVIZ_REPORT.html was generated! NOTEBOOK/COLAB USERS: the web browser MAY not pop up, regardless, the report IS saved in your notebook/colab files.


In [23]:
drug.dtypes

Ptid                              object
Persistency_Flag                  object
Gender                            object
Race                              object
Ethnicity                         object
                                   ...  
Risk_Hysterectomy_Oophorectomy    object
Risk_Estrogen_Deficiency          object
Risk_Immobilization               object
Risk_Recurring_Falls              object
Count_Of_Risks                     int64
Length: 69, dtype: object
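
Many of the columns listed above are reported as `object`, and several of them appear to hold plain Y/N flags. Before encoding, it may help to separate those flags from the genuinely categorical columns. The following is a minimal sketch of one way to do that. The frame `toy` and its rows are invented for illustration (only the column names come from the dataset), and `yn_cols` is a hypothetical helper name.

```python
import pandas as pd

# Invented rows; only the column names are taken from the dataset.
toy = pd.DataFrame({
    "Ptid": ["P1", "P2", "P3"],
    "Gender": ["Male", "Male", "Female"],
    "Risk_Low_Calcium_Intake": ["N", "Y", "N"],
    "Risk_Immobilization": ["N", "N", "N"],
    "Count_Of_Risks": [0, 2, 1],
})

# Non-numeric columns are the candidates for encoding.
non_numeric_cols = toy.select_dtypes(exclude="number").columns

# Keep the columns whose non-missing values are all drawn from {"Y", "N"}.
yn_cols = [
    col for col in non_numeric_cols
    if set(toy[col].dropna().unique()) <= {"Y", "N"}
]
print(yn_cols)
```

A column that happens to contain only `N` still counts as a flag here, which is usually the desired behaviour for rare risk factors.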

In [20]:
drug.duplicated(subset=None, keep='first').sum()

0
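
The check above compares entire rows, so it would not flag a patient who appears twice with different values in some column. A complementary check, sketched below on an invented frame (`toy` is hypothetical), looks for repeats in the identifier column alone.

```python
import pandas as pd

# Invented rows: P2 appears twice, but the two rows differ in Region.
toy = pd.DataFrame({
    "Ptid": ["P1", "P2", "P2"],
    "Region": ["West", "West", "Midwest"],
})

# Whole-row comparison: the two P2 rows are not identical, so nothing is flagged.
full_row_dupes = int(toy.duplicated().sum())

# Identifier-only comparison: the second P2 row is flagged.
id_dupes = int(toy.duplicated(subset="Ptid").sum())

# Shortcut for the same question on a single column.
ids_unique = toy["Ptid"].is_unique

print(full_row_dupes, id_dupes, ids_unique)
```

On the real data, `drug["Ptid"].is_unique` would answer the same question in one call.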

In [30]:
drug['Risk_Hysterectomy_Oophorectomy'].values

array(['N', 'N', 'N', ..., 'N', 'N', 'N'], dtype=object)

In [33]:
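# Encode N/Y as 0/1 and assign the result back (any value other than N or Y would become NaN)
drug['Risk_Hysterectomy_Oophorectomy'] = drug['Risk_Hysterectomy_Oophorectomy'].map({'N': 0, 'Y': 1})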

In [34]:
drug['Risk_Hysterectomy_Oophorectomy'].values

array([0, 0, 0, ..., 0, 0, 0], dtype=int64)
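
The same N/Y to 0/1 encoding could be applied to every risk column in one pass instead of one column at a time. The sketch below assumes that all such columns share the `Risk_` prefix and contain only `N` and `Y`, which has not been verified here for the full dataset. The frame `toy` and the names `yn_map`, `risk_cols` and `encoded` are hypothetical.

```python
import pandas as pd

# Invented rows; only the column names are taken from the dataset.
toy = pd.DataFrame({
    "Ptid": ["P1", "P2", "P3"],
    "Risk_Low_Calcium_Intake": ["N", "Y", "N"],
    "Risk_Recurring_Falls": ["Y", "N", "N"],
    "Count_Of_Risks": [1, 1, 0],
})

yn_map = {"N": 0, "Y": 1}

# Select columns by prefix; Count_Of_Risks does not start with "Risk_" and is left alone.
risk_cols = [col for col in toy.columns if col.startswith("Risk_")]

# Work on a copy so the original frame is untouched.
encoded = toy.copy()
encoded[risk_cols] = encoded[risk_cols].apply(lambda col: col.map(yn_map))

print(encoded)
```

Because `map` turns any unmapped value into `NaN`, a follow-up `encoded[risk_cols].isna().sum()` is a cheap way to confirm that the N/Y assumption held.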