# Anti-Diabetes Medications Data Extraction and Refinement  

This notebook covers how the target data set of anti-diabetes medication products is prepared.  
The process includes data extraction, filtering and grouping, cleaning, and combination.

## Data Extraction from the FDA USA Commercial Product Database
The USA commercial product database can be downloaded from https://www.accessdata.fda.gov/cder/ndcxls.zip

In [3]:
import requests
import zipfile
import pandas as pd

In [4]:
# NDC database file from the FDA (covers all current commercial USA products)
url = "https://www.accessdata.fda.gov/cder/ndcxls.zip"

In [5]:
# Download the zip archive (the timeout keeps the request from hanging indefinitely)
response = requests.get(url, timeout=60)
if response.status_code == 200:
    with open("ndcxls.zip", "wb") as file:
        file.write(response.content)
    print("Downloaded successfully.")
else:
    print("Failed to download. Status code:", response.status_code)

Downloaded successfully.


In [7]:
#Unzip folder
extract_dir = "extracted_files"

with zipfile.ZipFile("ndcxls.zip", 'r') as zip_ref:
    zip_ref.extractall(extract_dir)
    

In [8]:
#Load files 
product_filename = "product.xls"
package_filename = "package.xls"

# One challenge: both .xls files are actually tab-delimited text, and neither decodes as UTF-8, so read them as Latin-1
product_df = pd.read_csv(f"{extract_dir}/{product_filename}",delimiter='\t',encoding='latin1')
package_df = pd.read_csv(f"{extract_dir}/{package_filename}",delimiter='\t',encoding='latin1')    
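
As a minimal sketch of the loading step above, the following builds a tiny tab-delimited, Latin-1 encoded file in memory and reads it the same way. The sample rows are made up for illustration. The `dtype=str` argument and the date parsing are assumptions added here, not part of the original cell; they keep date-like columns such as STARTMARKETINGDATE from being read as numbers.

```python
import io
import pandas as pd

# Hypothetical sample mimicking the layout of product.xls: tab-delimited text
# encoded as Latin-1 (the byte for "é" is not valid UTF-8 on its own).
raw = (
    "PRODUCTID\tPRODUCTNDC\tPROPRIETARYNAME\tSTARTMARKETINGDATE\n"
    "0002-0213_x\t0002-0213\tHumulin\t19830627\n"
    "9999-0001_y\t9999-0001\tExemplé\t20200115\n"
).encode("latin1")

# dtype=str keeps every column as text, so nothing is silently cast to float.
sample_df = pd.read_csv(io.BytesIO(raw), sep="\t", encoding="latin1", dtype=str)

# Convert the YYYYMMDD strings to real dates for later filtering.
sample_df["STARTMARKETINGDATE"] = pd.to_datetime(
    sample_df["STARTMARKETINGDATE"], format="%Y%m%d"
)
print(sample_df.dtypes)
```

Reading everything as text first and converting selected columns afterwards is one design choice among several; passing a per-column `dtype` mapping would work as well.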



### product_df
product_df contains all currently active USA pharmaceutical products, with attribute fields describing each product, including dosage, molecule name (substance), route, trade label (proprietary name), etc.  
We will use the PHARM_CLASSES field to select anti-diabetes products and the PRODUCTID field to map products to package NDCs in package_df.

In [9]:
product_df.head(5)

Unnamed: 0,PRODUCTID,PRODUCTNDC,PRODUCTTYPENAME,PROPRIETARYNAME,PROPRIETARYNAMESUFFIX,NONPROPRIETARYNAME,DOSAGEFORMNAME,ROUTENAME,STARTMARKETINGDATE,ENDMARKETINGDATE,MARKETINGCATEGORYNAME,APPLICATIONNUMBER,LABELERNAME,SUBSTANCENAME,ACTIVE_NUMERATOR_STRENGTH,ACTIVE_INGRED_UNIT,PHARM_CLASSES,DEASCHEDULE,NDC_EXCLUDE_FLAG,LISTING_RECORD_CERTIFIED_THROUGH
0,0002-0213_458ef2aa-cd5f-48bc-8829-82420cfed33b,0002-0213,HUMAN OTC DRUG,Humulin,R,Insulin human,"INJECTION, SOLUTION",PARENTERAL,19830627,,BLA,BLA018780,Eli Lilly and Company,INSULIN HUMAN,100.0,[iU]/mL,"Insulin [CS], Insulin [EPC]",,N,20241231.0
1,0002-0800_dec32ead-837e-4331-ab55-f3bbccea5b38,0002-0800,HUMAN OTC DRUG,Sterile Diluent,,diluent,"INJECTION, SOLUTION",SUBCUTANEOUS,19870710,,BLA,BLA018781,Eli Lilly and Company,WATER,1.0,mL/mL,,,N,20241231.0
2,0002-1152_a820004b-8342-4e58-b733-2d97445b2f5e,0002-1152,HUMAN PRESCRIPTION DRUG,MOUNJARO,,tirzepatide,"INJECTION, SOLUTION",SUBCUTANEOUS,20230728,,NDA,NDA215866,Eli Lilly and Company,TIRZEPATIDE,2.5,mg/.5mL,"G-Protein-linked Receptor Interactions [MoA], ...",,N,20241231.0
3,0002-1200_7832a4ce-bb0e-4753-a72c-a5bc9621f08c,0002-1200,HUMAN PRESCRIPTION DRUG,Amyvid,,Florbetapir F 18,"INJECTION, SOLUTION",INTRAVENOUS,20120601,,NDA,NDA202008,Eli Lilly and Company,FLORBETAPIR F-18,51.0,mCi/mL,"Positron Emitting Activity [MoA], Radioactive ...",,N,20241231.0
4,0002-1210_d03b2693-0231-4df4-a037-63017a42e85a,0002-1210,HUMAN PRESCRIPTION DRUG,TAUVID,,Flortaucipir F-18,"INJECTION, SOLUTION",INTRAVENOUS,20200528,,NDA,NDA212123,Eli Lilly and Company,FLORTAUCIPIR F-18,51.0,mCi/mL,,,N,20231231.0


### package_df
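package_df contains the package-level NDC (NDCPACKAGECODE), a unique identifier for each product package. The values in this file are in the FDA's hyphenated 10-digit format (for example, 0002-0213-01) rather than the 11-digit format. This key will be used to map to the drug utilization file.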

In [10]:
package_df.head(5)

Unnamed: 0,PRODUCTID,PRODUCTNDC,NDCPACKAGECODE,PACKAGEDESCRIPTION,STARTMARKETINGDATE,ENDMARKETINGDATE,NDC_EXCLUDE_FLAG,SAMPLE_PACKAGE
0,0002-0213_458ef2aa-cd5f-48bc-8829-82420cfed33b,0002-0213,0002-0213-01,"1 VIAL, MULTI-DOSE in 1 CARTON (0002-0213-01) ...",20230620,,N,N
1,0002-0800_dec32ead-837e-4331-ab55-f3bbccea5b38,0002-0800,0002-0800-01,1 VIAL in 1 CARTON (0002-0800-01) / 10 mL in ...,19870710,,N,N
2,0002-1152_a820004b-8342-4e58-b733-2d97445b2f5e,0002-1152,0002-1152-01,"1 VIAL, SINGLE-DOSE in 1 CARTON (0002-1152-01)...",20230728,,N,N
3,0002-1200_7832a4ce-bb0e-4753-a72c-a5bc9621f08c,0002-1200,0002-1200-48,"1 VIAL, MULTI-DOSE in 1 CAN (0002-1200-48) / ...",20230522,,N,N
4,0002-1200_7832a4ce-bb0e-4753-a72c-a5bc9621f08c,0002-1200,0002-1200-50,"1 VIAL, MULTI-DOSE in 1 CAN (0002-1200-50) / ...",20120601,,N,N
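
The NDCPACKAGECODE values shown above are hyphenated and contain 10 digits (for example, 0002-0213-01). If the drug utilization file keys on the 11-digit, unhyphenated form (an assumption here, since that file is not shown), each segment can be zero-padded to a 5-4-2 layout. The helper name `to_ndc11` is hypothetical, and the second example code below is made up for illustration.

```python
def to_ndc11(ndc10: str) -> str:
    """Zero-pad a hyphenated 10-digit NDC to the 11-digit 5-4-2 layout."""
    labeler, product, package = ndc10.split("-")
    # Whichever segment is one digit short gets a leading zero.
    return labeler.zfill(5) + product.zfill(4) + package.zfill(2)


print(to_ndc11("0002-0213-01"))   # 00002021301
print(to_ndc11("23155-147-01"))   # 23155014701 (hypothetical package code)
```

On the real data this could be applied with `package_df["NDCPACKAGECODE"].map(to_ndc11)`, provided every value has exactly three hyphen-separated segments.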


## Anti-diabetes medication group
Following the reference below, we categorize anti-diabetes medications into 10 groups:
https://www.healthline.com/health/diabetes/medications-list

### Insulin
Insulin is the most common type of medication used in type 1 diabetes treatment. 
There are more than 20 types sold in the United States.
The goal of treatment is to replace the insulin that your pancreas can’t make.  
Some people with type 2 diabetes may also need to take insulin. The same types of insulin used to treat type 1 diabetes can also treat type 2 diabetes.


In [12]:
Insulin=product_df[product_df['PHARM_CLASSES'].str.contains('Insulin', case=False, na=False)]
Insulin.head(5)

Unnamed: 0,PRODUCTID,PRODUCTNDC,PRODUCTTYPENAME,PROPRIETARYNAME,PROPRIETARYNAMESUFFIX,NONPROPRIETARYNAME,DOSAGEFORMNAME,ROUTENAME,STARTMARKETINGDATE,ENDMARKETINGDATE,MARKETINGCATEGORYNAME,APPLICATIONNUMBER,LABELERNAME,SUBSTANCENAME,ACTIVE_NUMERATOR_STRENGTH,ACTIVE_INGRED_UNIT,PHARM_CLASSES,DEASCHEDULE,NDC_EXCLUDE_FLAG,LISTING_RECORD_CERTIFIED_THROUGH
0,0002-0213_458ef2aa-cd5f-48bc-8829-82420cfed33b,0002-0213,HUMAN OTC DRUG,Humulin,R,Insulin human,"INJECTION, SOLUTION",PARENTERAL,19830627,,BLA,BLA018780,Eli Lilly and Company,INSULIN HUMAN,100.0,[iU]/mL,"Insulin [CS], Insulin [EPC]",,N,20241231.0
2,0002-1152_a820004b-8342-4e58-b733-2d97445b2f5e,0002-1152,HUMAN PRESCRIPTION DRUG,MOUNJARO,,tirzepatide,"INJECTION, SOLUTION",SUBCUTANEOUS,20230728,,NDA,NDA215866,Eli Lilly and Company,TIRZEPATIDE,2.5,mg/.5mL,"G-Protein-linked Receptor Interactions [MoA], ...",,N,20241231.0
6,0002-1243_a820004b-8342-4e58-b733-2d97445b2f5e,0002-1243,HUMAN PRESCRIPTION DRUG,MOUNJARO,,tirzepatide,"INJECTION, SOLUTION",SUBCUTANEOUS,20230728,,NDA,NDA215866,Eli Lilly and Company,TIRZEPATIDE,5.0,mg/.5mL,"G-Protein-linked Receptor Interactions [MoA], ...",,N,20241231.0
11,0002-1457_a820004b-8342-4e58-b733-2d97445b2f5e,0002-1457,HUMAN PRESCRIPTION DRUG,MOUNJARO,,tirzepatide,"INJECTION, SOLUTION",SUBCUTANEOUS,20220513,,NDA,NDA215866,Eli Lilly and Company,TIRZEPATIDE,15.0,mg/.5mL,"G-Protein-linked Receptor Interactions [MoA], ...",,N,20241231.0
12,0002-1460_a820004b-8342-4e58-b733-2d97445b2f5e,0002-1460,HUMAN PRESCRIPTION DRUG,MOUNJARO,,tirzepatide,"INJECTION, SOLUTION",SUBCUTANEOUS,20220513,,NDA,NDA215866,Eli Lilly and Company,TIRZEPATIDE,12.5,mg/.5mL,"G-Protein-linked Receptor Interactions [MoA], ...",,N,20241231.0
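
MOUNJARO (tirzepatide) rows appear in the output above, which suggests that the substring `'Insulin'` matched something other than a standalone insulin class. One possible cause, stated as an assumption because the PHARM_CLASSES values are truncated in the output, is a longer term such as "Insulinotropic". The sketch below uses made-up class strings to show how a word-boundary regex narrows the match.

```python
import pandas as pd

# Made-up class strings for illustration only.
demo = pd.DataFrame({
    "PROPRIETARYNAME": ["Drug A", "Drug B", "Drug C"],
    "PHARM_CLASSES": [
        "Insulin [CS], Insulin [EPC]",
        "Glucose-dependent Insulinotropic Polypeptide Receptor Agonist [EPC]",
        None,
    ],
})

# Plain substring match: also hits "Insulinotropic".
loose = demo[demo["PHARM_CLASSES"].str.contains("Insulin", case=False, na=False)]

# Word-boundary regex: "Insulin" must stand alone as a word.
strict = demo[
    demo["PHARM_CLASSES"].str.contains(r"\bInsulin\b", case=False, na=False, regex=True)
]

print(len(loose), len(strict))  # 2 1
```

Whether the stricter match is appropriate depends on how the insulin group should be defined, so the untruncated PHARM_CLASSES values are worth inspecting first.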


### Alpha-glucosidase inhibitors
These medications slow the breakdown of starchy foods and table sugar in the gut. This effect lowers your blood sugar levels.
When taken as prescribed, these medications won’t cause hypoglycemia (low blood sugar). However, your risk of hypoglycemia may be greater if you take them with other types of diabetes medications.

In [18]:
Alpha_G_INHB=product_df[product_df['PHARM_CLASSES'].str.contains('alpha Glucosidase Inhibitors', case=False, na=False)]
Alpha_G_INHB.head(5)

Unnamed: 0,PRODUCTID,PRODUCTNDC,PRODUCTTYPENAME,PROPRIETARYNAME,PROPRIETARYNAMESUFFIX,NONPROPRIETARYNAME,DOSAGEFORMNAME,ROUTENAME,STARTMARKETINGDATE,ENDMARKETINGDATE,MARKETINGCATEGORYNAME,APPLICATIONNUMBER,LABELERNAME,SUBSTANCENAME,ACTIVE_NUMERATOR_STRENGTH,ACTIVE_INGRED_UNIT,PHARM_CLASSES,DEASCHEDULE,NDC_EXCLUDE_FLAG,LISTING_RECORD_CERTIFIED_THROUGH
794,0054-0140_68881739-6edb-4c67-be06-14953f60a146,0054-0140,HUMAN PRESCRIPTION DRUG,Acarbose,,Acarbose,TABLET,ORAL,20080507,,ANDA,ANDA078470,Hikma Pharmaceuticals USA Inc.,ACARBOSE,25,mg/1,"alpha Glucosidase Inhibitors [MoA], alpha-Gluc...",,N,20241231.0
795,0054-0141_68881739-6edb-4c67-be06-14953f60a146,0054-0141,HUMAN PRESCRIPTION DRUG,Acarbose,,Acarbose,TABLET,ORAL,20080507,,ANDA,ANDA078470,Hikma Pharmaceuticals USA Inc.,ACARBOSE,50,mg/1,"alpha Glucosidase Inhibitors [MoA], alpha-Gluc...",,N,20241231.0
796,0054-0142_68881739-6edb-4c67-be06-14953f60a146,0054-0142,HUMAN PRESCRIPTION DRUG,Acarbose,,Acarbose,TABLET,ORAL,20080507,,ANDA,ANDA078470,Hikma Pharmaceuticals USA Inc.,ACARBOSE,100,mg/1,"alpha Glucosidase Inhibitors [MoA], alpha-Gluc...",,N,20241231.0
19263,23155-147_84ff1519-f0a6-4d8f-afd3-bce6f78813aa,23155-147,HUMAN PRESCRIPTION DRUG,Acarbose,,Acarbose,TABLET,ORAL,20210416,,ANDA,ANDA202271,Heritage Pharmaceuticals Inc. d/b/a Avet Pharm...,ACARBOSE,25,mg/1,"alpha Glucosidase Inhibitors [MoA], alpha-Gluc...",,N,20241231.0
19264,23155-148_84ff1519-f0a6-4d8f-afd3-bce6f78813aa,23155-148,HUMAN PRESCRIPTION DRUG,Acarbose,,Acarbose,TABLET,ORAL,20210416,,ANDA,ANDA202271,Heritage Pharmaceuticals Inc. d/b/a Avet Pharm...,ACARBOSE,50,mg/1,"alpha Glucosidase Inhibitors [MoA], alpha-Gluc...",,N,20241231.0


### Thiazolidinediones
Thiazolidinediones work by decreasing glucose in your liver. They also help your fat cells use insulin better by targeting insulin resistance.

In [14]:
Thiazolidinedione= product_df[product_df['PHARM_CLASSES'].str.contains('Thiazolidinedione', case=False, na=False)]
Thiazolidinedione.head(5)

Unnamed: 0,PRODUCTID,PRODUCTNDC,PRODUCTTYPENAME,PROPRIETARYNAME,PROPRIETARYNAMESUFFIX,NONPROPRIETARYNAME,DOSAGEFORMNAME,ROUTENAME,STARTMARKETINGDATE,ENDMARKETINGDATE,MARKETINGCATEGORYNAME,APPLICATIONNUMBER,LABELERNAME,SUBSTANCENAME,ACTIVE_NUMERATOR_STRENGTH,ACTIVE_INGRED_UNIT,PHARM_CLASSES,DEASCHEDULE,NDC_EXCLUDE_FLAG,LISTING_RECORD_CERTIFIED_THROUGH
1928,0093-7271_b5d757a4-7f61-4f9d-bd2a-0ebf9892398d,0093-7271,HUMAN PRESCRIPTION DRUG,Pioglitazone,,Pioglitazone,TABLET,ORAL,20150204,,ANDA,ANDA077210,"Teva Pharmaceuticals USA, Inc.",PIOGLITAZONE HYDROCHLORIDE,15,mg/1,"PPAR alpha [CS], PPAR gamma [CS], Peroxisome P...",,N,20231231.0
1929,0093-7272_b5d757a4-7f61-4f9d-bd2a-0ebf9892398d,0093-7272,HUMAN PRESCRIPTION DRUG,Pioglitazone,,Pioglitazone,TABLET,ORAL,20150204,,ANDA,ANDA077210,"Teva Pharmaceuticals USA, Inc.",PIOGLITAZONE HYDROCHLORIDE,30,mg/1,"PPAR alpha [CS], PPAR gamma [CS], Peroxisome P...",,N,20231231.0
1930,0093-7273_b5d757a4-7f61-4f9d-bd2a-0ebf9892398d,0093-7273,HUMAN PRESCRIPTION DRUG,Pioglitazone,,Pioglitazone,TABLET,ORAL,20150522,,ANDA,ANDA077210,"Teva Pharmaceuticals USA, Inc.",PIOGLITAZONE HYDROCHLORIDE,45,mg/1,"PPAR alpha [CS], PPAR gamma [CS], Peroxisome P...",,N,20231231.0
10567,0781-5420_114a7af7-cc48-4535-8b56-589941e5f521,0781-5420,HUMAN PRESCRIPTION DRUG,Pioglitazone,,pioglitazone,TABLET,ORAL,20130301,,ANDA,ANDA078670,Sandoz Inc,PIOGLITAZONE HYDROCHLORIDE,15,mg/1,"PPAR alpha [CS], PPAR gamma [CS], Peroxisome P...",,N,20231231.0
10568,0781-5421_114a7af7-cc48-4535-8b56-589941e5f521,0781-5421,HUMAN PRESCRIPTION DRUG,Pioglitazone,,pioglitazone,TABLET,ORAL,20130301,,ANDA,ANDA078670,Sandoz Inc,PIOGLITAZONE HYDROCHLORIDE,30,mg/1,"PPAR alpha [CS], PPAR gamma [CS], Peroxisome P...",,N,20231231.0


### Biguanides
Biguanides decrease how much glucose your liver makes. They also decrease how much glucose your intestines absorb, help your muscles absorb glucose, and make your body more sensitive to insulin.

In [17]:
Biguanides=product_df[product_df['PHARM_CLASSES'].str.contains('Biguanide', case=False, na=False)]
Biguanides.head(5)

Unnamed: 0,PRODUCTID,PRODUCTNDC,PRODUCTTYPENAME,PROPRIETARYNAME,PROPRIETARYNAMESUFFIX,NONPROPRIETARYNAME,DOSAGEFORMNAME,ROUTENAME,STARTMARKETINGDATE,ENDMARKETINGDATE,MARKETINGCATEGORYNAME,APPLICATIONNUMBER,LABELERNAME,SUBSTANCENAME,ACTIVE_NUMERATOR_STRENGTH,ACTIVE_INGRED_UNIT,PHARM_CLASSES,DEASCHEDULE,NDC_EXCLUDE_FLAG,LISTING_RECORD_CERTIFIED_THROUGH
179,0006-0078_40b87f24-8005-4ee7-b162-94690443fdb0,0006-0078,HUMAN PRESCRIPTION DRUG,JANUMET,XR,sitagliptin and metformin hydrochloride,"TABLET, FILM COATED, EXTENDED RELEASE",ORAL,20120202,,NDA,NDA202270,Merck Sharp & Dohme LLC,METFORMIN HYDROCHLORIDE; SITAGLIPTIN PHOSPHATE,500; 50,mg/1; mg/1,"Biguanide [EPC], Biguanides [CS], Dipeptidyl P...",,N,20241231.0
180,0006-0080_40b87f24-8005-4ee7-b162-94690443fdb0,0006-0080,HUMAN PRESCRIPTION DRUG,JANUMET,XR,sitagliptin and metformin hydrochloride,"TABLET, FILM COATED, EXTENDED RELEASE",ORAL,20120202,,NDA,NDA202270,Merck Sharp & Dohme LLC,METFORMIN HYDROCHLORIDE; SITAGLIPTIN PHOSPHATE,1000; 50,mg/1; mg/1,"Biguanide [EPC], Biguanides [CS], Dipeptidyl P...",,N,20241231.0
181,0006-0081_40b87f24-8005-4ee7-b162-94690443fdb0,0006-0081,HUMAN PRESCRIPTION DRUG,JANUMET,XR,sitagliptin and metformin hydrochloride,"TABLET, FILM COATED, EXTENDED RELEASE",ORAL,20120202,,NDA,NDA202270,Merck Sharp & Dohme LLC,METFORMIN HYDROCHLORIDE; SITAGLIPTIN PHOSPHATE,1000; 100,mg/1; mg/1,"Biguanide [EPC], Biguanides [CS], Dipeptidyl P...",,N,20241231.0
194,0006-0575_57ef5ead-02e0-417d-b21f-8e41ff34b53f,0006-0575,HUMAN PRESCRIPTION DRUG,JANUMET,,SITAGLIPTIN and METFORMIN HYDROCHLORIDE,"TABLET, FILM COATED",ORAL,20070330,,NDA,NDA022044,Merck Sharp & Dohme LLC,METFORMIN HYDROCHLORIDE; SITAGLIPTIN PHOSPHATE,500; 50,mg/1; mg/1,"Biguanide [EPC], Biguanides [CS], Dipeptidyl P...",,N,20241231.0
195,0006-0577_57ef5ead-02e0-417d-b21f-8e41ff34b53f,0006-0577,HUMAN PRESCRIPTION DRUG,JANUMET,,SITAGLIPTIN and METFORMIN HYDROCHLORIDE,"TABLET, FILM COATED",ORAL,20070330,,NDA,NDA022044,Merck Sharp & Dohme LLC,METFORMIN HYDROCHLORIDE; SITAGLIPTIN PHOSPHATE,1000; 50,mg/1; mg/1,"Biguanide [EPC], Biguanides [CS], Dipeptidyl P...",,N,20241231.0


### Sulfonylureas
These are among the oldest diabetes drugs still used today. They work by stimulating the beta cells of the pancreas, which causes your body to make more insulin.

In [20]:
Sulfonylurea=product_df[product_df['PHARM_CLASSES'].str.contains('Sulfonylurea', case=False, na=False)]
Sulfonylurea.head(5)

Unnamed: 0,PRODUCTID,PRODUCTNDC,PRODUCTTYPENAME,PROPRIETARYNAME,PROPRIETARYNAMESUFFIX,NONPROPRIETARYNAME,DOSAGEFORMNAME,ROUTENAME,STARTMARKETINGDATE,ENDMARKETINGDATE,MARKETINGCATEGORYNAME,APPLICATIONNUMBER,LABELERNAME,SUBSTANCENAME,ACTIVE_NUMERATOR_STRENGTH,ACTIVE_INGRED_UNIT,PHARM_CLASSES,DEASCHEDULE,NDC_EXCLUDE_FLAG,LISTING_RECORD_CERTIFIED_THROUGH
327,0009-0341_555c58db-4984-40d4-9806-86655fade006,0009-0341,HUMAN PRESCRIPTION DRUG,Glynase,,glyburide,TABLET,ORAL,19920304,,NDA,NDA020051,Pharmacia & Upjohn Company LLC,GLYBURIDE,1.5,mg/1,"Sulfonylurea Compounds [CS], Sulfonylurea [EPC]",,N,20241231.0
329,0009-0352_555c58db-4984-40d4-9806-86655fade006,0009-0352,HUMAN PRESCRIPTION DRUG,Glynase,,glyburide,TABLET,ORAL,19920304,,NDA,NDA020051,Pharmacia & Upjohn Company LLC,GLYBURIDE,3.0,mg/1,"Sulfonylurea Compounds [CS], Sulfonylurea [EPC]",,N,20241231.0
356,0009-3449_555c58db-4984-40d4-9806-86655fade006,0009-3449,HUMAN PRESCRIPTION DRUG,Glynase,,glyburide,TABLET,ORAL,19920304,,NDA,NDA020051,Pharmacia & Upjohn Company LLC,GLYBURIDE,6.0,mg/1,"Sulfonylurea Compounds [CS], Sulfonylurea [EPC]",,N,20241231.0
671,0049-0170_1fdfc3b7-b578-4c68-896e-b8cf38d10fa1,0049-0170,HUMAN PRESCRIPTION DRUG,Glucotrol,XL,glipizide,"TABLET, EXTENDED RELEASE",ORAL,20130715,,NDA,NDA020329,Roerig,GLIPIZIDE,2.5,mg/1,"Sulfonylurea Compounds [CS], Sulfonylurea [EPC]",,N,20241231.0
672,0049-0174_1fdfc3b7-b578-4c68-896e-b8cf38d10fa1,0049-0174,HUMAN PRESCRIPTION DRUG,Glucotrol,XL,glipizide,"TABLET, EXTENDED RELEASE",ORAL,20130509,,NDA,NDA020329,Roerig,GLIPIZIDE,5.0,mg/1,"Sulfonylurea Compounds [CS], Sulfonylurea [EPC]",,N,20241231.0


### Dopamine-2 agonist
It’s unknown exactly how this drug treats type 2 diabetes. It may affect rhythms in your body and prevent insulin resistance. According to one 2015 review, dopamine-2 agonists may also improve other related health concerns, such as high cholesterol or weight management.


In [22]:
DA_2A=product_df[product_df['SUBSTANCENAME'].str.contains('BROMOCRIPTINE MESYLATE', case=False, na=False)]
DA_2A.head(5)

Unnamed: 0,PRODUCTID,PRODUCTNDC,PRODUCTTYPENAME,PROPRIETARYNAME,PROPRIETARYNAMESUFFIX,NONPROPRIETARYNAME,DOSAGEFORMNAME,ROUTENAME,STARTMARKETINGDATE,ENDMARKETINGDATE,MARKETINGCATEGORYNAME,APPLICATIONNUMBER,LABELERNAME,SUBSTANCENAME,ACTIVE_NUMERATOR_STRENGTH,ACTIVE_INGRED_UNIT,PHARM_CLASSES,DEASCHEDULE,NDC_EXCLUDE_FLAG,LISTING_RECORD_CERTIFIED_THROUGH
7323,0378-7096_caaa8c54-ee5c-47a9-8455-1f49fab45143,0378-7096,HUMAN PRESCRIPTION DRUG,Bromocriptine Mesylate,,bromocriptine mesylate,CAPSULE,ORAL,20130617,,ANDA,ANDA077226,Mylan Pharmaceuticals Inc.,BROMOCRIPTINE MESYLATE,5.0,mg/1,"Ergolines [CS], Ergot Derivative [EPC]",,N,20231231.0
9314,0574-0106_4132274e-3abb-49b4-8d1b-3400a1df9daf,0574-0106,HUMAN PRESCRIPTION DRUG,Bromocriptine mesylate,,Bromocriptine mesylate,TABLET,ORAL,20140401,,ANDA,ANDA077646,Padagis US LLC,BROMOCRIPTINE MESYLATE,2.5,mg/1,"Ergolines [CS], Ergot Derivative [EPC]",,N,20231231.0
10563,0781-5325_ed58abc4-1571-46be-8190-29dcaed68fef,0781-5325,HUMAN PRESCRIPTION DRUG,Bromocriptine mesylate,,Bromocriptine mesylate,TABLET,ORAL,19980113,,ANDA,ANDA074631,Sandoz Inc,BROMOCRIPTINE MESYLATE,2.5,mg/1,"Ergolines [CS], Ergot Derivative [EPC]",,N,20231231.0
21384,30698-201_e66711ca-726a-4a85-b2fa-d5ccb85e1221,30698-201,HUMAN PRESCRIPTION DRUG,Parlodel,,bromocriptine mesylate,"CAPSULE, GELATIN COATED",ORAL,20140428,,NDA,NDA017962,Validus Pharmaceuticals LLC,BROMOCRIPTINE MESYLATE,5.0,mg/1,"Ergolines [CS], Ergot Derivative [EPC]",,N,20241231.0
21385,30698-202_e66711ca-726a-4a85-b2fa-d5ccb85e1221,30698-202,HUMAN PRESCRIPTION DRUG,Parlodel,,bromocriptine mesylate,TABLET,ORAL,20140428,,NDA,NDA017962,Validus Pharmaceuticals LLC,BROMOCRIPTINE MESYLATE,2.5,mg/1,"Ergolines [CS], Ergot Derivative [EPC]",,N,20241231.0


### Sodium-glucose transporter (SGLT) 2 inhibitors
Sodium-glucose transporter (SGLT) 2 inhibitors work by preventing the kidneys from holding on to glucose. Instead, your body gets rid of the glucose through your urine.

In [23]:
SGLT_2_INHB=product_df[product_df['PHARM_CLASSES'].str.contains('Sodium-Glucose Transporter 2 Inhibitors', case=False, na=False)]
SGLT_2_INHB.head(5)

Unnamed: 0,PRODUCTID,PRODUCTNDC,PRODUCTTYPENAME,PROPRIETARYNAME,PROPRIETARYNAMESUFFIX,NONPROPRIETARYNAME,DOSAGEFORMNAME,ROUTENAME,STARTMARKETINGDATE,ENDMARKETINGDATE,MARKETINGCATEGORYNAME,APPLICATIONNUMBER,LABELERNAME,SUBSTANCENAME,ACTIVE_NUMERATOR_STRENGTH,ACTIVE_INGRED_UNIT,PHARM_CLASSES,DEASCHEDULE,NDC_EXCLUDE_FLAG,LISTING_RECORD_CERTIFIED_THROUGH
5545,0310-6205_129852e5-b1f1-488e-99fe-aa8d0d4fd07a,0310-6205,HUMAN PRESCRIPTION DRUG,FARXIGA,,DAPAGLIFLOZIN,"TABLET, FILM COATED",ORAL,20080114,,NDA,NDA202293,AstraZeneca Pharmaceuticals LP,DAPAGLIFLOZIN PROPANEDIOL,5,mg/1,Sodium-Glucose Cotransporter 2 Inhibitor [EPC]...,,N,20241231.0
5546,0310-6210_129852e5-b1f1-488e-99fe-aa8d0d4fd07a,0310-6210,HUMAN PRESCRIPTION DRUG,FARXIGA,,DAPAGLIFLOZIN,"TABLET, FILM COATED",ORAL,20080114,,NDA,NDA202293,AstraZeneca Pharmaceuticals LP,DAPAGLIFLOZIN PROPANEDIOL,10,mg/1,Sodium-Glucose Cotransporter 2 Inhibitor [EPC]...,,N,20241231.0
5547,0310-6225_7d19a7d8-7419-4bf1-8462-52894fe73902,0310-6225,HUMAN PRESCRIPTION DRUG,XIGDUO,XR,dapagliflozin and metformin hydrochloride,"TABLET, FILM COATED, EXTENDED RELEASE",ORAL,20141029,,NDA,NDA205649,AstraZeneca Pharmaceuticals LP,DAPAGLIFLOZIN PROPANEDIOL; METFORMIN HYDROCHLO...,2.5; 1000,mg/1; mg/1,"Biguanide [EPC], Biguanides [CS], Sodium-Gluco...",,N,20241231.0
5548,0310-6250_7d19a7d8-7419-4bf1-8462-52894fe73902,0310-6250,HUMAN PRESCRIPTION DRUG,XIGDUO,XR,dapagliflozin and metformin hydrochloride,"TABLET, FILM COATED, EXTENDED RELEASE",ORAL,20141029,,NDA,NDA205649,AstraZeneca Pharmaceuticals LP,DAPAGLIFLOZIN PROPANEDIOL; METFORMIN HYDROCHLO...,5; 500,mg/1; mg/1,"Biguanide [EPC], Biguanides [CS], Sodium-Gluco...",,N,20241231.0
5549,0310-6260_7d19a7d8-7419-4bf1-8462-52894fe73902,0310-6260,HUMAN PRESCRIPTION DRUG,XIGDUO,XR,dapagliflozin and metformin hydrochloride,"TABLET, FILM COATED, EXTENDED RELEASE",ORAL,20141029,,NDA,NDA205649,AstraZeneca Pharmaceuticals LP,DAPAGLIFLOZIN PROPANEDIOL; METFORMIN HYDROCHLO...,5; 1000,mg/1; mg/1,"Biguanide [EPC], Biguanides [CS], Sodium-Gluco...",,N,20241231.0


### Glucagon-like peptide-1 receptor agonists (GLP-1 receptor agonists)
GLP-1 receptor agonists are similar to incretin and may be prescribed in addition to a diet and exercise plan to help promote better glycemic control.

They increase how much insulin your body uses and the growth of pancreatic beta cells. They decrease your appetite and how much glucagon your body uses. They also slow stomach emptying, which may maximize nutrient absorption from the foods you eat while potentially helping you maintain or lose weight.

This is the medication group Elon is taking.

In [25]:
GLP_1_RA=product_df[product_df['PHARM_CLASSES'].str.contains('GLP-1 Receptor Agonist', case=False, na=False)]
GLP_1_RA.head(5)

Unnamed: 0,PRODUCTID,PRODUCTNDC,PRODUCTTYPENAME,PROPRIETARYNAME,PROPRIETARYNAMESUFFIX,NONPROPRIETARYNAME,DOSAGEFORMNAME,ROUTENAME,STARTMARKETINGDATE,ENDMARKETINGDATE,MARKETINGCATEGORYNAME,APPLICATIONNUMBER,LABELERNAME,SUBSTANCENAME,ACTIVE_NUMERATOR_STRENGTH,ACTIVE_INGRED_UNIT,PHARM_CLASSES,DEASCHEDULE,NDC_EXCLUDE_FLAG,LISTING_RECORD_CERTIFIED_THROUGH
2,0002-1152_a820004b-8342-4e58-b733-2d97445b2f5e,0002-1152,HUMAN PRESCRIPTION DRUG,MOUNJARO,,tirzepatide,"INJECTION, SOLUTION",SUBCUTANEOUS,20230728,,NDA,NDA215866,Eli Lilly and Company,TIRZEPATIDE,2.5,mg/.5mL,"G-Protein-linked Receptor Interactions [MoA], ...",,N,20241231.0
6,0002-1243_a820004b-8342-4e58-b733-2d97445b2f5e,0002-1243,HUMAN PRESCRIPTION DRUG,MOUNJARO,,tirzepatide,"INJECTION, SOLUTION",SUBCUTANEOUS,20230728,,NDA,NDA215866,Eli Lilly and Company,TIRZEPATIDE,5.0,mg/.5mL,"G-Protein-linked Receptor Interactions [MoA], ...",,N,20241231.0
7,0002-1433_e7c6d8de-a271-493d-8a2a-c9c08e8f92b8,0002-1433,HUMAN PRESCRIPTION DRUG,Trulicity,,Dulaglutide,"INJECTION, SOLUTION",SUBCUTANEOUS,20140918,,BLA,BLA125469,Eli Lilly and Company,DULAGLUTIDE,0.75,mg/.5mL,"GLP-1 Receptor Agonist [EPC], Glucagon-Like Pe...",,N,20241231.0
8,0002-1434_e7c6d8de-a271-493d-8a2a-c9c08e8f92b8,0002-1434,HUMAN PRESCRIPTION DRUG,Trulicity,,Dulaglutide,"INJECTION, SOLUTION",SUBCUTANEOUS,20140918,,BLA,BLA125469,Eli Lilly and Company,DULAGLUTIDE,1.5,mg/.5mL,"GLP-1 Receptor Agonist [EPC], Glucagon-Like Pe...",,N,20241231.0
11,0002-1457_a820004b-8342-4e58-b733-2d97445b2f5e,0002-1457,HUMAN PRESCRIPTION DRUG,MOUNJARO,,tirzepatide,"INJECTION, SOLUTION",SUBCUTANEOUS,20220513,,NDA,NDA215866,Eli Lilly and Company,TIRZEPATIDE,15.0,mg/.5mL,"G-Protein-linked Receptor Interactions [MoA], ...",,N,20241231.0


### Glinide
These medications help your body release insulin. However, they aren’t for everyone. In some cases, they may lower your blood sugar too much, especially if you have advanced kidney disease.

In [26]:
Glinide=product_df[product_df['PHARM_CLASSES'].str.contains('Glinide', case=False, na=False)]
Glinide.head(5)

Unnamed: 0,PRODUCTID,PRODUCTNDC,PRODUCTTYPENAME,PROPRIETARYNAME,PROPRIETARYNAMESUFFIX,NONPROPRIETARYNAME,DOSAGEFORMNAME,ROUTENAME,STARTMARKETINGDATE,ENDMARKETINGDATE,MARKETINGCATEGORYNAME,APPLICATIONNUMBER,LABELERNAME,SUBSTANCENAME,ACTIVE_NUMERATOR_STRENGTH,ACTIVE_INGRED_UNIT,PHARM_CLASSES,DEASCHEDULE,NDC_EXCLUDE_FLAG,LISTING_RECORD_CERTIFIED_THROUGH
9329,0574-0240_d49adc4a-7bb7-4794-8355-210325b08ff5,0574-0240,HUMAN PRESCRIPTION DRUG,Repaglinide,,REPAGLINIDE,TABLET,ORAL,20130809,,ANDA,ANDA201189,Padagis US LLC,REPAGLINIDE,0.5,mg/1,"Glinide [EPC], Potassium Channel Antagonists [...",,N,20241231.0
9330,0574-0241_d49adc4a-7bb7-4794-8355-210325b08ff5,0574-0241,HUMAN PRESCRIPTION DRUG,Repaglinide,,REPAGLINIDE,TABLET,ORAL,20140122,20250131.0,ANDA,ANDA201189,Padagis US LLC,REPAGLINIDE,1.0,mg/1,"Glinide [EPC], Potassium Channel Antagonists [...",,N,
9331,0574-0242_d49adc4a-7bb7-4794-8355-210325b08ff5,0574-0242,HUMAN PRESCRIPTION DRUG,Repaglinide,,REPAGLINIDE,TABLET,ORAL,20140122,20240531.0,ANDA,ANDA201189,Padagis US LLC,REPAGLINIDE,2.0,mg/1,"Glinide [EPC], Potassium Channel Antagonists [...",,N,
9511,0591-3354_573675b0-8d66-4d58-85f2-428dda7c166c,0591-3354,HUMAN PRESCRIPTION DRUG,Nateglinide,,Nateglinide,TABLET,ORAL,20110518,,ANDA,ANDA077462,"Actavis Pharma, Inc.",NATEGLINIDE,60.0,mg/1,"Glinide [EPC], Potassium Channel Antagonists [...",,N,20241231.0
9512,0591-3355_573675b0-8d66-4d58-85f2-428dda7c166c,0591-3355,HUMAN PRESCRIPTION DRUG,Nateglinide,,Nateglinide,TABLET,ORAL,20110518,,ANDA,ANDA077462,"Actavis Pharma, Inc.",NATEGLINIDE,120.0,mg/1,"Glinide [EPC], Potassium Channel Antagonists [...",,N,20241231.0


### Dipeptidyl peptidase-4 (DPP-4) inhibitors
DPP-4 inhibitors are used to help reduce blood sugar without causing hypoglycemia.

DPP-4 inhibitors block the DPP-4 enzyme. This enzyme destroys a hormone called incretin, which normally helps your body produce insulin when it’s needed. Incretins also decrease glucose output from the liver when your body doesn’t need it.

In [27]:
DDP_4_INHB=product_df[product_df['PHARM_CLASSES'].str.contains('Dipeptidyl Peptidase 4 Inhibitor', case=False, na=False)]
DDP_4_INHB.head(5)

Unnamed: 0,PRODUCTID,PRODUCTNDC,PRODUCTTYPENAME,PROPRIETARYNAME,PROPRIETARYNAMESUFFIX,NONPROPRIETARYNAME,DOSAGEFORMNAME,ROUTENAME,STARTMARKETINGDATE,ENDMARKETINGDATE,MARKETINGCATEGORYNAME,APPLICATIONNUMBER,LABELERNAME,SUBSTANCENAME,ACTIVE_NUMERATOR_STRENGTH,ACTIVE_INGRED_UNIT,PHARM_CLASSES,DEASCHEDULE,NDC_EXCLUDE_FLAG,LISTING_RECORD_CERTIFIED_THROUGH
179,0006-0078_40b87f24-8005-4ee7-b162-94690443fdb0,0006-0078,HUMAN PRESCRIPTION DRUG,JANUMET,XR,sitagliptin and metformin hydrochloride,"TABLET, FILM COATED, EXTENDED RELEASE",ORAL,20120202,,NDA,NDA202270,Merck Sharp & Dohme LLC,METFORMIN HYDROCHLORIDE; SITAGLIPTIN PHOSPHATE,500; 50,mg/1; mg/1,"Biguanide [EPC], Biguanides [CS], Dipeptidyl P...",,N,20241231.0
180,0006-0080_40b87f24-8005-4ee7-b162-94690443fdb0,0006-0080,HUMAN PRESCRIPTION DRUG,JANUMET,XR,sitagliptin and metformin hydrochloride,"TABLET, FILM COATED, EXTENDED RELEASE",ORAL,20120202,,NDA,NDA202270,Merck Sharp & Dohme LLC,METFORMIN HYDROCHLORIDE; SITAGLIPTIN PHOSPHATE,1000; 50,mg/1; mg/1,"Biguanide [EPC], Biguanides [CS], Dipeptidyl P...",,N,20241231.0
181,0006-0081_40b87f24-8005-4ee7-b162-94690443fdb0,0006-0081,HUMAN PRESCRIPTION DRUG,JANUMET,XR,sitagliptin and metformin hydrochloride,"TABLET, FILM COATED, EXTENDED RELEASE",ORAL,20120202,,NDA,NDA202270,Merck Sharp & Dohme LLC,METFORMIN HYDROCHLORIDE; SITAGLIPTIN PHOSPHATE,1000; 100,mg/1; mg/1,"Biguanide [EPC], Biguanides [CS], Dipeptidyl P...",,N,20241231.0
182,0006-0112_6970c5e9-bd40-4847-9321-ab5826b7b7e5,0006-0112,HUMAN PRESCRIPTION DRUG,JANUVIA,,sitagliptin,"TABLET, FILM COATED",ORAL,20061016,,NDA,NDA021995,Merck Sharp & Dohme LLC,SITAGLIPTIN PHOSPHATE,50,mg/1,"Dipeptidyl Peptidase 4 Inhibitor [EPC], Dipept...",,N,20241231.0
183,0006-0221_6970c5e9-bd40-4847-9321-ab5826b7b7e5,0006-0221,HUMAN PRESCRIPTION DRUG,JANUVIA,,sitagliptin,"TABLET, FILM COATED",ORAL,20061016,,NDA,NDA021995,Merck Sharp & Dohme LLC,SITAGLIPTIN PHOSPHATE,25,mg/1,"Dipeptidyl Peptidase 4 Inhibitor [EPC], Dipept...",,N,20241231.0


### Group them together into one file


In [29]:
grp = [Insulin, Thiazolidinedione, Sulfonylurea,SGLT_2_INHB,Glinide,GLP_1_RA,DDP_4_INHB,DA_2A,Biguanides,Alpha_G_INHB] 

In [30]:
DB_Grp = pd.concat(grp, ignore_index=True).drop_duplicates()

In [31]:
DB_Grp.head(5)

Unnamed: 0,PRODUCTID,PRODUCTNDC,PRODUCTTYPENAME,PROPRIETARYNAME,PROPRIETARYNAMESUFFIX,NONPROPRIETARYNAME,DOSAGEFORMNAME,ROUTENAME,STARTMARKETINGDATE,ENDMARKETINGDATE,MARKETINGCATEGORYNAME,APPLICATIONNUMBER,LABELERNAME,SUBSTANCENAME,ACTIVE_NUMERATOR_STRENGTH,ACTIVE_INGRED_UNIT,PHARM_CLASSES,DEASCHEDULE,NDC_EXCLUDE_FLAG,LISTING_RECORD_CERTIFIED_THROUGH
0,0002-0213_458ef2aa-cd5f-48bc-8829-82420cfed33b,0002-0213,HUMAN OTC DRUG,Humulin,R,Insulin human,"INJECTION, SOLUTION",PARENTERAL,19830627,,BLA,BLA018780,Eli Lilly and Company,INSULIN HUMAN,100.0,[iU]/mL,"Insulin [CS], Insulin [EPC]",,N,20241231.0
1,0002-1152_a820004b-8342-4e58-b733-2d97445b2f5e,0002-1152,HUMAN PRESCRIPTION DRUG,MOUNJARO,,tirzepatide,"INJECTION, SOLUTION",SUBCUTANEOUS,20230728,,NDA,NDA215866,Eli Lilly and Company,TIRZEPATIDE,2.5,mg/.5mL,"G-Protein-linked Receptor Interactions [MoA], ...",,N,20241231.0
2,0002-1243_a820004b-8342-4e58-b733-2d97445b2f5e,0002-1243,HUMAN PRESCRIPTION DRUG,MOUNJARO,,tirzepatide,"INJECTION, SOLUTION",SUBCUTANEOUS,20230728,,NDA,NDA215866,Eli Lilly and Company,TIRZEPATIDE,5.0,mg/.5mL,"G-Protein-linked Receptor Interactions [MoA], ...",,N,20241231.0
3,0002-1457_a820004b-8342-4e58-b733-2d97445b2f5e,0002-1457,HUMAN PRESCRIPTION DRUG,MOUNJARO,,tirzepatide,"INJECTION, SOLUTION",SUBCUTANEOUS,20220513,,NDA,NDA215866,Eli Lilly and Company,TIRZEPATIDE,15.0,mg/.5mL,"G-Protein-linked Receptor Interactions [MoA], ...",,N,20241231.0
4,0002-1460_a820004b-8342-4e58-b733-2d97445b2f5e,0002-1460,HUMAN PRESCRIPTION DRUG,MOUNJARO,,tirzepatide,"INJECTION, SOLUTION",SUBCUTANEOUS,20220513,,NDA,NDA215866,Eli Lilly and Company,TIRZEPATIDE,12.5,mg/.5mL,"G-Protein-linked Receptor Interactions [MoA], ...",,N,20241231.0
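
Combination products such as JANUMET match more than one class filter, so the concatenated frame contains repeated rows, which is why `drop_duplicates()` is applied. A possible variation, sketched below on toy data, records which groups each product matched before collapsing to one row per product. The `patterns` mapping and the `GROUP` column are hypothetical names introduced for this example only.

```python
import pandas as pd

toy = pd.DataFrame({
    "PRODUCTID": ["p1", "p2", "p3"],
    "PHARM_CLASSES": [
        "Biguanide [EPC], Dipeptidyl Peptidase 4 Inhibitor [EPC]",
        "Sulfonylurea [EPC]",
        "Ergolines [CS]",
    ],
})

# Hypothetical mapping of group label -> substring searched in PHARM_CLASSES.
patterns = {
    "Biguanides": "Biguanide",
    "DPP-4 inhibitors": "Dipeptidyl Peptidase 4 Inhibitor",
    "Sulfonylureas": "Sulfonylurea",
}

frames = []
for label, pattern in patterns.items():
    mask = toy["PHARM_CLASSES"].str.contains(pattern, case=False, na=False)
    frames.append(toy[mask].assign(GROUP=label))

tagged = pd.concat(frames, ignore_index=True)

# One row per product, with every matched group kept as a joined label.
groups = tagged.groupby("PRODUCTID")["GROUP"].agg("; ".join).reset_index()
combined = toy.merge(groups, on="PRODUCTID", how="inner")
print(combined[["PRODUCTID", "GROUP"]])
```

Keeping the group label makes later per-class summaries possible without re-running the filters; products that match no pattern (p3 here) are dropped by the inner merge.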


### Attach Product Package NDC from package_df
We also add the package description field in case subsequent analysis needs it.

In [32]:
DB_Grp_NDC=pd.merge(DB_Grp, package_df[['PRODUCTID', 'NDCPACKAGECODE','PACKAGEDESCRIPTION']], on='PRODUCTID', how='left')


In [33]:
DB_Grp_NDC.head(5)

Unnamed: 0,PRODUCTID,PRODUCTNDC,PRODUCTTYPENAME,PROPRIETARYNAME,PROPRIETARYNAMESUFFIX,NONPROPRIETARYNAME,DOSAGEFORMNAME,ROUTENAME,STARTMARKETINGDATE,ENDMARKETINGDATE,...,LABELERNAME,SUBSTANCENAME,ACTIVE_NUMERATOR_STRENGTH,ACTIVE_INGRED_UNIT,PHARM_CLASSES,DEASCHEDULE,NDC_EXCLUDE_FLAG,LISTING_RECORD_CERTIFIED_THROUGH,NDCPACKAGECODE,PACKAGEDESCRIPTION
0,0002-0213_458ef2aa-cd5f-48bc-8829-82420cfed33b,0002-0213,HUMAN OTC DRUG,Humulin,R,Insulin human,"INJECTION, SOLUTION",PARENTERAL,19830627,,...,Eli Lilly and Company,INSULIN HUMAN,100.0,[iU]/mL,"Insulin [CS], Insulin [EPC]",,N,20241231.0,0002-0213-01,"1 VIAL, MULTI-DOSE in 1 CARTON (0002-0213-01) ..."
1,0002-1152_a820004b-8342-4e58-b733-2d97445b2f5e,0002-1152,HUMAN PRESCRIPTION DRUG,MOUNJARO,,tirzepatide,"INJECTION, SOLUTION",SUBCUTANEOUS,20230728,,...,Eli Lilly and Company,TIRZEPATIDE,2.5,mg/.5mL,"G-Protein-linked Receptor Interactions [MoA], ...",,N,20241231.0,0002-1152-01,"1 VIAL, SINGLE-DOSE in 1 CARTON (0002-1152-01)..."
2,0002-1243_a820004b-8342-4e58-b733-2d97445b2f5e,0002-1243,HUMAN PRESCRIPTION DRUG,MOUNJARO,,tirzepatide,"INJECTION, SOLUTION",SUBCUTANEOUS,20230728,,...,Eli Lilly and Company,TIRZEPATIDE,5.0,mg/.5mL,"G-Protein-linked Receptor Interactions [MoA], ...",,N,20241231.0,0002-1243-01,"1 VIAL, SINGLE-DOSE in 1 CARTON (0002-1243-01)..."
3,0002-1457_a820004b-8342-4e58-b733-2d97445b2f5e,0002-1457,HUMAN PRESCRIPTION DRUG,MOUNJARO,,tirzepatide,"INJECTION, SOLUTION",SUBCUTANEOUS,20220513,,...,Eli Lilly and Company,TIRZEPATIDE,15.0,mg/.5mL,"G-Protein-linked Receptor Interactions [MoA], ...",,N,20241231.0,0002-1457-80,4 SYRINGE in 1 CARTON (0002-1457-80) / .5 mL ...
4,0002-1460_a820004b-8342-4e58-b733-2d97445b2f5e,0002-1460,HUMAN PRESCRIPTION DRUG,MOUNJARO,,tirzepatide,"INJECTION, SOLUTION",SUBCUTANEOUS,20220513,,...,Eli Lilly and Company,TIRZEPATIDE,12.5,mg/.5mL,"G-Protein-linked Receptor Interactions [MoA], ...",,N,20241231.0,0002-1460-80,4 SYRINGE in 1 CARTON (0002-1460-80) / .5 mL ...


In [34]:
#DB_Grp_NDC.to_excel(r'C:\Users\syuan\OneDrive - Fresenius\Desktop\ITS\Python Files\DP_File_Check.xlsx',sheet_name='DC')

In [35]:
### Check and clean
len(DB_Grp_NDC)

3336

In [37]:
### Check the count of unmatched codes (a discontinued product may have no NDC package code)
empty_ndc_count = DB_Grp_NDC['NDCPACKAGECODE'].isnull().sum()
empty_ndc_count

1
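
To see which product failed to match, rather than only counting it, the left merge can carry an indicator column. The sketch below uses toy frames with made-up IDs and codes. `indicator=True` and `validate="one_to_many"` are standard `pd.merge` options, but using them here is a suggestion and not part of the original workflow.

```python
import pandas as pd

products = pd.DataFrame({
    "PRODUCTID": ["p1", "p2", "p3"],
    "PROPRIETARYNAME": ["A", "B", "C"],
})
packages = pd.DataFrame({
    "PRODUCTID": ["p1", "p1", "p2"],
    "NDCPACKAGECODE": ["0000-0001-01", "0000-0001-02", "0000-0002-01"],
})

# validate raises if PRODUCTID is not unique on the product side;
# indicator adds a "_merge" column marking where each row came from.
merged = pd.merge(
    products, packages, on="PRODUCTID", how="left",
    indicator=True, validate="one_to_many",
)

# Rows present only in the product table have no package code.
unmatched = merged[merged["_merge"] == "left_only"]
print(unmatched[["PRODUCTID", "PROPRIETARYNAME"]])
```

Listing the unmatched rows before dropping them makes it easier to confirm that the missing package code really belongs to a discontinued product.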

In [38]:
DB_GRP = DB_Grp_NDC.dropna(subset=['NDCPACKAGECODE'], how='any')

In [39]:
len(DB_GRP)

3335