# Introduction
This notebook extracts the tickers of companies in the **Biotechnology** industry. The source is a dataset
containing information about companies listed on the US stock market. The dataset groups companies into 69 industries, including Biotechnology.

[View Initial Dataset on Kaggle](https://www.kaggle.com/datasets/marketahead/all-us-stocks-tickers-company-info-logos)

# Workflow

## Import Packages and Data

In [74]:
import pandas as pd

In [75]:
external_dataset = pd.read_csv('companies.csv')
external_dataset

,ticker,company name,short name,industry,description,website,logo,ceo,exchange,market cap,sector,tag 1,tag 2,tag 3
0,A,Agilent Technologies Inc.,Agilent,Medical Diagnostics & Research,Agilent Technologies Inc is engaged in life sc...,http://www.agilent.com,A.png,Michael R. McMullen,New York Stock Exchange,2.421807e+10,Healthcare,Healthcare,Diagnostics & Research,Medical Diagnostics & Research
1,AA,Alcoa Corporation,Alcoa,Metals & Mining,Alcoa Corp is an integrated aluminum company. ...,http://www.alcoa.com,AA.png,Roy Christopher Harvey,New York Stock Exchange,5.374967e+09,Basic Materials,Basic Materials,Aluminum,Metals & Mining
2,AABA,Altaba Inc.,Altaba,Asset Management,"Altaba Inc is an independent, non-diversified,...",http://www.altaba.com,AABA.png,Thomas J. Mcinerney,Nasdaq Global Select,4.122368e+10,Financial Services,Financial Services,Asset Management,
3,AAC,AAC Holdings Inc.,AAC,Health Care Providers,AAC Holdings Inc provides inpatient and outpat...,http://www.americanaddictioncenters.org,,Michael T. Cartwright,New York Stock Exchange,6.372010e+07,Healthcare,Healthcare,Medical Care,Health Care Providers
4,AADR,AdvisorShares Dorsey Wright ADR,AdvisorShares Dorsey Wright,,The investment seeks long-term capital appreci...,http://www.advisorshares.com,AADR.png,,NYSE Arca,1.031612e+08,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
6363,ZTS,Zoetis Inc. Class A,Zoetis,Drug Manufacturers,Zoetis Inc is a developer and manufacturer of ...,http://www.zoetis.com,ZTS.png,Juan Ramon Alaix,New York Stock Exchange,4.205627e+10,Healthcare,Healthcare,Drug Manufacturers - Specialty & Generic,Drug Manufacturers
6364,ZUMZ,Zumiez Inc.,Zumiez,Retail - Apparel & Specialty,Zumiez Inc is a multi-channel specialty retail...,http://www.zumiez.com,ZUMZ.png,Richard M. Brooks,Nasdaq Global Select,6.150368e+08,Consumer Cyclical,Consumer Cyclical,Specialty Retail,Retail - Apparel & Specialty
6365,ZUO,Zuora Inc. Class A,Zuora,Application Software,Zuora Inc provides cloud-based software on a s...,https://www.zuora.com,ZUO.png,Tien Tzuo,New York Stock Exchange,2.304595e+09,Technology,Technology,Software - Infrastructure,Application Software
6366,ZYME,Zymeworks Inc.,Zymeworks,Biotechnology,Zymeworks Inc is a clinical-stage biopharmaceu...,http://www.zymeworks.com,ZYME.png,Ali Tehrani,New York Stock Exchange,5.042878e+08,Healthcare,Healthcare,Biotechnology,
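Since only a few of the columns matter for this task, the load step could also be restricted to them. The following is a minimal sketch, not the notebook's own approach: it uses a hypothetical in-memory stand-in for `companies.csv` (three rows copied from the preview above, with a reduced set of columns) and the `usecols` parameter of `pd.read_csv`.

```python
import io

import pandas as pd

# Hypothetical stand-in for companies.csv; the rows come from the preview above,
# but the real file has more columns and more than six thousand rows.
sample_csv = io.StringIO(
    "ticker,company name,industry,sector\n"
    "A,Agilent Technologies Inc.,Medical Diagnostics & Research,Healthcare\n"
    "AA,Alcoa Corporation,Metals & Mining,Basic Materials\n"
    "ZYME,Zymeworks Inc.,Biotechnology,Healthcare\n"
)

# usecols tells read_csv to parse only the listed columns.
needed = ["ticker", "company name", "industry"]
sample = pd.read_csv(sample_csv, usecols=needed)

print(sample.columns.tolist())
```

With the real file, the same call would take the path `'companies.csv'` in place of the in-memory buffer.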


## Dataset Overview

In [76]:
external_dataset.columns

Index(['ticker', 'company name', 'short name', 'industry', 'description',
       'website', 'logo', 'ceo', 'exchange', 'market cap', 'sector', 'tag 1',
       'tag 2', 'tag 3'],
      dtype='object')

In [77]:
external_dataset.industry.value_counts()

Asset Management        598
Biotechnology           491
Banks                   449
Application Software    305
REITs                   226
                       ... 
Personal Services        12
Health Care Plans        10
Tobacco Products          8
Medical Distribution      7
Truck Manufacturing       6
Name: industry, Length: 69, dtype: int64

As the counts show, there are **491** companies in the Biotechnology industry.
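The count for a single industry can also be read directly from the result of `value_counts()` instead of from the printed output. This is a small sketch on a hypothetical five-entry series, not on the real dataset. It also illustrates that `value_counts()` leaves out missing values by default, which matters here because some rows (for example `AADR`) have no industry.

```python
import pandas as pd

# Hypothetical miniature "industry" column, including one missing value.
industries = pd.Series(
    ["Biotechnology", "Banks", "Biotechnology", "Asset Management", None],
    name="industry",
)

counts = industries.value_counts()       # missing values are excluded by default
n_biotech = counts["Biotechnology"]      # look up one industry by its label
n_missing = industries.isna().sum()      # rows with no industry at all

print(n_biotech, n_missing)
```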

## Extract Data

In [78]:
# biotech -> companies whose industry feature equals Biotechnology
biotech = external_dataset.loc[external_dataset.industry == 'Biotechnology']

In [79]:
biotech

,ticker,company name,short name,industry,description,website,logo,ceo,exchange,market cap,sector,tag 1,tag 2,tag 3
24,ABEO,Abeona Therapeutics Inc.,Abeona Therapeutics,Biotechnology,Abeona Therapeutics Inc is a clinical-stage bi...,http://www.abeonatherapeutics.com,ABEO.png,Joao Siffert,NASDAQ Capital Market,3.643781e+08,Healthcare,Healthcare,Biotechnology,
28,ABIO,ARCA biopharma Inc.,ARCA biopharma,Biotechnology,ARCA biopharma Inc is a biopharmaceutical comp...,http://www.arcabiopharma.com,,Michael R. Bristow,NASDAQ Capital Market,5.848104e+06,Healthcare,Healthcare,Biotechnology,
34,ABUS,Arbutus Biopharma Corporation,Arbutus Biopharma,Biotechnology,Arbutus Biopharma Corp is a biopharmaceutical ...,http://www.arbutusbio.com,,Mark Murray,Nasdaq Global Select,2.207994e+08,Healthcare,Healthcare,Biotechnology,
37,ACAD,ACADIA Pharmaceuticals Inc.,ACADIA Pharmaceuticals,Biotechnology,ACADIA Pharmaceuticals Inc is a biotechnology ...,http://www.acadia-pharm.com,ACAD.png,Stephen R. Davis,Nasdaq Global Select,2.839192e+09,Healthcare,Healthcare,Biotechnology,
42,ACER,Acer Therapeutics Inc.,Acer Therapeutics,Biotechnology,Acer Therapeutics Inc is a pharmaceutical comp...,https://www.acertx.com,ACER.png,Chris Schelling,NASDAQ Capital Market,2.387585e+08,Healthcare,Healthcare,Biotechnology,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
6348,ZIOP,ZIOPHARM Oncology Inc,ZIOPHARM Oncology,Biotechnology,ZIOPHARM Oncology Inc is a biotechnology compa...,http://www.ziopharm.com,ZIOP.png,Laurence James Neil Cooper,NASDAQ Capital Market,3.583108e+08,Healthcare,Healthcare,Biotechnology,
6352,ZLAB,Zai Lab Limited,Zai Lab Limited,Biotechnology,Zai Lab Ltd is a biopharmaceutical company. Th...,http://www.zailaboratory.com,,Ying Du,NASDAQ Global Market,1.410893e+09,Healthcare,Healthcare,Biotechnology,
6359,ZSAN,Zosano Pharma Corporation,Zosano Pharma,Biotechnology,Zosano Pharma Corp is a clinical-stage special...,http://www.zosanopharma.com,,John P. Walker,NASDAQ Capital Market,2.744220e+07,Healthcare,Healthcare,Biotechnology,
6366,ZYME,Zymeworks Inc.,Zymeworks,Biotechnology,Zymeworks Inc is a clinical-stage biopharmaceu...,http://www.zymeworks.com,ZYME.png,Ali Tehrani,New York Stock Exchange,5.042878e+08,Healthcare,Healthcare,Biotechnology,
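The filter above is an exact string comparison, so a label with different casing or stray whitespace would be missed. The sketch below compares the exact match with a normalized one on a hypothetical four-row frame. The `" biotechnology "` variant is invented for illustration, and nothing in the dataset preview suggests such variants actually occur.

```python
import pandas as pd

# Hypothetical rows; the oddly formatted label on ZYME is made up for this example.
companies = pd.DataFrame(
    {
        "ticker": ["ABEO", "AA", "AADR", "ZYME"],
        "industry": ["Biotechnology", "Metals & Mining", None, " biotechnology "],
    }
)

# Exact match, as in the cell above.
exact = companies.loc[companies["industry"] == "Biotechnology"]

# Normalized match: trim whitespace and ignore case before comparing.
# Rows with a missing industry compare as False and are dropped.
normalized = companies["industry"].str.strip().str.casefold()
tolerant = companies.loc[normalized == "biotechnology"]

print(exact["ticker"].tolist(), tolerant["ticker"].tolist())
```

If both selections return the same rows on the real data, the exact comparison is sufficient.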


In [80]:
# The ticker and company name columns are the only attributes needed
cols = ['ticker', 'company name']
biotech = biotech[cols]
biotech

,ticker,company name
24,ABEO,Abeona Therapeutics Inc.
28,ABIO,ARCA biopharma Inc.
34,ABUS,Arbutus Biopharma Corporation
37,ACAD,ACADIA Pharmaceuticals Inc.
42,ACER,Acer Therapeutics Inc.
...,...,...
6348,ZIOP,ZIOPHARM Oncology Inc
6352,ZLAB,Zai Lab Limited
6359,ZSAN,Zosano Pharma Corporation
6366,ZYME,Zymeworks Inc.


In [83]:
# Reset index to start from 0, discarding the old row labels
biotech = biotech.reset_index(drop=True)
biotech

,ticker,company name
0,ABEO,Abeona Therapeutics Inc.
1,ABIO,ARCA biopharma Inc.
2,ABUS,Arbutus Biopharma Corporation
3,ACAD,ACADIA Pharmaceuticals Inc.
4,ACER,Acer Therapeutics Inc.
...,...,...
486,ZIOP,ZIOPHARM Oncology Inc
487,ZLAB,Zai Lab Limited
488,ZSAN,Zosano Pharma Corporation
489,ZYME,Zymeworks Inc.
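The re-indexing step can be checked on a small frame. In this sketch the frame is a hypothetical three-row excerpt that keeps the original row labels 24, 28 and 34. Calling `reset_index(drop=True)` discards the old labels rather than moving them into an `index` column, and it returns a new frame without modifying the original.

```python
import pandas as pd

# Hypothetical excerpt of the filtered frame, keeping the original row labels.
subset = pd.DataFrame(
    {"ticker": ["ABEO", "ABIO", "ABUS"]},
    index=[24, 28, 34],
)

# drop=True throws the old labels away instead of adding an "index" column.
renumbered = subset.reset_index(drop=True)

print(renumbered.index.tolist())
```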


In [84]:
# Export biotech data frame as a .csv file
biotech.to_csv('biotech_companies.csv')
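By default, `to_csv` also writes the row index as an unnamed first column, which shows up as `Unnamed: 0` when the file is read back. If only the two data columns are wanted in the file, `index=False` is one option. The sketch below demonstrates the difference with an in-memory buffer and a hypothetical two-row frame, so it does not touch `biotech_companies.csv`.

```python
import io

import pandas as pd

# Hypothetical two-row stand-in for the final biotech frame.
biotech_sample = pd.DataFrame(
    {
        "ticker": ["ABEO", "ZYME"],
        "company name": ["Abeona Therapeutics Inc.", "Zymeworks Inc."],
    }
)

# Default export: the index is written as a leading column with an empty header.
with_index = io.StringIO()
biotech_sample.to_csv(with_index)
with_index.seek(0)
columns_with_index = pd.read_csv(with_index).columns.tolist()

# index=False: only the data columns are written.
without_index = io.StringIO()
biotech_sample.to_csv(without_index, index=False)
without_index.seek(0)
restored = pd.read_csv(without_index)

print(columns_with_index)
print(restored.columns.tolist())
```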