## Medical Device Recalls

### Step 2 - Explore Data

The FDA assigns every recall to one of three classes, ordered from most to least severe:

- Class I: A situation where there is a reasonable chance that a product will cause serious health problems or death.

- Class II: A situation where a product may cause a temporary or reversible health problem or where there is a slight chance that it will cause serious health problems or death.

- Class III: A situation where a product is not likely to cause any health problem or injury.
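
Because the three classes form a severity scale, it can be convenient to encode them as an ordered categorical so that rows can be compared and sorted by severity. The following is a minimal sketch on made-up values; the variable names and the use of `pd.Categorical` are illustrative assumptions, not part of the original notebook.

```python
import pandas as pd

# Severity order, least to most severe (Class I is the most severe,
# as described in the list above).
severity_order = ["Class III", "Class II", "Class I"]

# Hypothetical sample values, not taken from the dataset.
classes = pd.Series(["Class II", "Class I", "Class III", "Class II"])
severity = pd.Series(
    pd.Categorical(classes, categories=severity_order, ordered=True)
)

# Ordered categoricals support comparisons against a category label.
at_least_class_ii = severity >= "Class II"
most_severe = severity.max()
```

With this encoding, a filter such as `severity >= "Class II"` keeps Class II and Class I rows without any string matching.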

In [1]:
import pandas as pd
import plotly.express as px

pd.set_option("display.max_colwidth",None)

In [2]:
df = pd.read_csv("../data/source/enforcement_reports.csv")
df.head(1).T

,0
country,United States
city,San Clemente
address_1,951 Calle Amanecer
reason_for_recall,"Identification of a potential manufacturing defect on the internal surface of the NanoClave within specific lots of NanoClave sets, which may inhibit a proper seal with the NanoClave spike."
address_2,
product_quantity,Total of all products (Listed #1 thru 101) = 304735 units
code_info,Lot Number:4496182.
center_classification_date,20201022
distribution_pattern,"Worldwide Distribution: US (nationwide): AL, AZ, CA, CO, DE, FL,GA,ID, IL,IN, KY, LA, MA, MD, MI, MN, MO, MS, NC, NE, NH,NJ,NM,NV, NY, OH, OK, OR, PA,RI, TN, TX, UT, VA, WA, WI, and WV; and OUS countries of: Belgium, Brazil, Canada, France, Germany, Ireland, Japan, Martinique, Mexico, Poland, and Saudi Arabia."
state,CA


In [3]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6904 entries, 0 to 6903
Data columns (total 25 columns):
 #   Column                      Non-Null Count  Dtype  
---  ------                      --------------  -----  
 0   country                     6904 non-null   object 
 1   city                        6904 non-null   object 
 2   address_1                   6904 non-null   object 
 3   reason_for_recall           6904 non-null   object 
 4   address_2                   465 non-null    object 
 5   product_quantity            6387 non-null   object 
 6   code_info                   6887 non-null   object 
 7   center_classification_date  6904 non-null   int64  
 8   distribution_pattern        6903 non-null   object 
 9   state                       6130 non-null   object 
 10  product_description         6903 non-null   object 
 11  report_date                 6904 non-null   int64  
 12  classification              6904 non-null   object 
 13  openfda                     6904 
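
The `df.info()` output shows `center_classification_date` and `report_date` stored as `int64` values such as `20201022`. If date arithmetic is needed later, one option is to parse them with an explicit format. This is a sketch on a small hypothetical frame (the `dates` name and all values except `20201022` are made up), not a step from the original notebook.

```python
import pandas as pd

# Hypothetical frame mimicking the integer date columns reported by df.info().
dates = pd.DataFrame({
    "center_classification_date": [20201022, 20190315],
    "report_date": [20201028, 20190327],
})

for col in ["center_classification_date", "report_date"]:
    # Cast to string first so the explicit format applies to the digits as text.
    dates[col] = pd.to_datetime(dates[col].astype(str), format="%Y%m%d")

# Elapsed days between classification and report, per row.
days_to_report = (
    dates["report_date"] - dates["center_classification_date"]
).dt.days
```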

In [4]:
# Verify that recall_number is a unique key for the enforcement reports:
# the number of distinct values should equal the row count (6904).
df["recall_number"].nunique()

6904
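
Comparing `nunique()` with the row count by eye works here, but the same check can be made explicit. The sketch below uses a tiny hypothetical frame (`reports` and its recall numbers are invented for illustration) to show `Series.is_unique` and `Series.duplicated`, which also reveal which rows collide when the check fails.

```python
import pandas as pd

# Hypothetical recall numbers; the last one repeats the first on purpose.
reports = pd.DataFrame(
    {"recall_number": ["Z-0001-2021", "Z-0002-2021", "Z-0001-2021"]}
)

# True only when every value appears exactly once.
is_key = reports["recall_number"].is_unique

# keep=False marks every member of a duplicated group, not just the repeats.
duplicates = reports[reports["recall_number"].duplicated(keep=False)]
```

On the real data, `df["recall_number"].is_unique` should evaluate to `True` if the count above matches the number of rows.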

## We are interested in just two columns

- `reason_for_recall`
- `classification`

In [5]:
df[["reason_for_recall", "classification"]].sample(5)

,reason_for_recall,classification
2100,"Due to a problem in the packaging sealing process at the supplier of the affected devices, it cannot be guaranteed that the sterilized medical devices in the scope of this recall remain sterile during their shelf-life.",Class II
5249,Labeling update concerning potential leaks from the catheter or the start-up kit (SUK) tubing,Class II
4348,"An increase in the rate of complaints for difficulty or inability to track over the guidewire, may result in a procedural delay due to the need to exchange the affected device",Class II
6212,Cordis is recalling one (1) lot of POWERFLEX P3 PTA Dilatation Catheter due to the potential for body/shaft voids in the proximal seal area.,Class II
493,"Potential product mix-up. The recalling firm has determined that one unit of OsteoVation EX, 3cc that may not have been sterilized was inadvertently packaged as CranioSculpt C, 10cc.",Class II
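
All five sampled rows above are Class II, which is what a plain random sample tends to produce when one class dominates. To look at examples from every class, one possibility is a per-class sample via `groupby(...).sample(...)`. The frame below is hypothetical and only imitates the imbalance; it is a sketch, not part of the original analysis.

```python
import pandas as pd

# Hypothetical frame with the same kind of imbalance as the real data.
reports = pd.DataFrame({
    "reason_for_recall": [f"reason {i}" for i in range(12)],
    "classification": ["Class II"] * 8 + ["Class I"] * 2 + ["Class III"] * 2,
})

# Draw the same number of rows from every class; random_state makes it repeatable.
per_class = reports.groupby("classification").sample(n=2, random_state=0)
counts = per_class["classification"].value_counts().to_dict()
```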


In [6]:
df["classification"].unique()

array(['Class II', 'Class I', 'Class III'], dtype=object)

In [7]:
# Count recalls per classification, ordered by class name
df_class_count = (
    df["classification"]
    .value_counts()
    .sort_index()
    .rename_axis("classification")
    .reset_index(name="count")
)

df_class_count

,classification,count
0,Class I,473
1,Class II,6266
2,Class III,165
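
The counts are easier to interpret as shares of the total. The short calculation below uses only the three numbers from the table above; the variable names are illustrative.

```python
# Counts copied from the table above.
class_counts = {"Class I": 473, "Class II": 6266, "Class III": 165}

total = sum(class_counts.values())
shares = {name: 100 * count / total for name, count in class_counts.items()}

for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
```

Class II makes up roughly nine in ten recalls, so any model or summary built on `classification` has to account for a strong class imbalance.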


In [8]:
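# Pie chart of recall counts per classification
fig = px.pie(
    df_class_count,
    names="classification",
    values="count",
    title="Medical Device Recall Classification",
    template="plotly_white",
)

# Label each slice outside the pie with its percentage, class name, and count;
# the legend would repeat the labels, so hide it.
fig.update_traces(textposition="outside", textinfo="percent+label+value")
fig.update_layout(showlegend=False)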

fig.show()

# The End.