## India Public Company Financial Information Crawler
##### Last Update: 2022-08-02
#### Author: Wonho Lim 
#### E-mail: wonholim02@gmail.com
#### Python Version: Python 3.10.4 (ipykernel)
#### Chrome Version: Chrome 103.0.5060.134 (64-bit)
#### Chrome Driver Version: ChromeDriver 103.0.5060.134

#### Crawled Website
BSE India MAIN: https://www.bseindia.com/

Description: This notebook crawls financial statements of public companies from BSE India (formerly the Bombay Stock Exchange), an official stock exchange website for India. Like SET, BSE provides detailed, officially formatted financial information on public companies. Its servers are stable and its financial statements are consistently structured, so errors are rare. A VPN is required to access the site; I used NordVPN for the crawl and it worked fine. As long as the website has not been modified, the code below should run properly, provided the cells are run in order. The environment settings must also be adjusted beforehand to match the user's computer/server setup and location. Because the purpose of this code is to collect financial information on public companies, it does not collect information on ETFs, funds, bonds, or other non-company entities listed on BSE India. A crawler for those entity types would need to be written from scratch to be accurate.

### 1. Importing useful open source libraries - BS4 and Selenium WebDriver are used for crawling

In [2]:
import bs4
import time
import csv 
import pandas as pd 
from platform import python_version
import requests
import lxml 
import xlrd
import selenium
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.select import Select
from selenium.webdriver.common.by import By
import xlsxwriter
from bs4 import BeautifulSoup

### 2. Environment
#### - ChromeDriver must be located in a folder on the local/cloud PC.
#### - The proper version can be downloaded at: chromedriver.chromium.org/downloads
#### - The path should be modified to match the local environment.
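
Because a wrong path only surfaces when the browser fails to start, it can help to check the path first. The following is a minimal sketch, not part of the original notebook: `check_driver_path` is a hypothetical helper, and the assumption that the executable may carry an `.exe` suffix on Windows is mine.

```python
from pathlib import Path


def check_driver_path(path):
    """Return the ChromeDriver executable as a Path, or raise if it is missing.

    Hypothetical helper: it accepts the path with or without an '.exe'
    suffix, since the suffix is often omitted on Windows.
    """
    driver_path = Path(path)
    candidates = (driver_path, driver_path.with_name(driver_path.name + ".exe"))
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"ChromeDriver not found at {driver_path}")


# Possible usage before creating the driver (commented out so the sketch
# runs on machines without ChromeDriver):
# path = str(check_driver_path(path))
```

Failing early with a clear message is cheaper than discovering the problem after the output workbook has already been created.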

In [3]:
path = r'C:\Users\user\Desktop\자료\chromedriver'
print(python_version())
print(pd.__version__)

3.9.12
1.4.2


## 3. Main Crawler
#### - Uncollectable or unavailable data is recorded as "" for convenience.
#### - Data cleansing may sometimes be required after crawling, but it is not difficult for anyone who knows how to use Excel.
#### - Some directions and explanations are written as comments below.
#### - The order of the append calls must follow the order of the column names in `content`.
#### - Many try-except blocks are used to avoid errors caused by non-existent pages or information.
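
The two ideas above, falling back to "" when a lookup fails and cleaning values afterwards, can be sketched in a few lines. This is an illustration under stated assumptions rather than the crawler's actual code: `safe_text`, `parse_amount`, and `fake_page` are hypothetical names, and the dictionary merely stands in for a live page so the example runs without a browser.

```python
import re


def safe_text(lookup, xpath, default="", suffix=""):
    """Return lookup(xpath) + suffix, or `default` if the lookup raises."""
    try:
        return lookup(xpath) + suffix
    except Exception:
        return default


def parse_amount(cell):
    """Convert a collected cell such as '1,234.50Cr. INR' to a float.

    Returns None when the cell holds no number (for example "").
    """
    match = re.search(r"-?\d[\d,]*(?:\.\d+)?", cell)
    if match is None:
        return None
    return float(match.group().replace(",", ""))


# Stand-in for a page: maps an XPath to the text found there (made-up data).
fake_page = {"//h1": "EXAMPLE LTD", "//td[4]": "1,234.50"}
lookup = fake_page.__getitem__

name = safe_text(lookup, "//h1")
revenue = safe_text(lookup, "//td[4]", suffix="Cr. INR")
missing = safe_text(lookup, "//no-such-node")  # falls back to ""
print(name, revenue, repr(missing), parse_amount(revenue))
```

In the notebook itself, `lookup` could be `lambda xp: driver.find_element(By.XPATH, xp).text`, which would replace each four-line try-except with a single call.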

### File Setting
#### - Check the xlsx file name and change it to suit your preference.
#### - The xlsx file will be created in the folder where the ipynb file is located.
#### - Columns can be changed if required, but the main crawler code must then be modified to match.
#### - Writing data to an xlsx file works differently from writing to a csv file.
#### - Closing the workbook is necessary; the data on the sheet will not be saved if it is not closed at the end.
#### - DO NOT RUN THIS CODE AGAIN AFTER CRAWLING, OR A BLANK FILE WILL REPLACE THE CURRENT FILE.
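
One way to reduce the risk of overwriting an existing result is to pick a file name that is not yet in use. The sketch below is an assumption on my part, not something the notebook does: `fresh_output_name` is a hypothetical helper that appends `_1`, `_2`, and so on until the name is free.

```python
from pathlib import Path


def fresh_output_name(filename):
    """Return `filename` if it is unused, else the first free 'stem_N.suffix'."""
    original = Path(filename)
    candidate = original
    counter = 1
    while candidate.exists():
        # e.g. India2020.xlsx -> India2020_1.xlsx -> India2020_2.xlsx
        candidate = original.with_name(f"{original.stem}_{counter}{original.suffix}")
        counter += 1
    return str(candidate)


# Possible usage in the file-setting cell:
# workbook = xlsxwriter.Workbook(fresh_output_name('India2020.xlsx'))
```

With this in place, re-running the cell would create a new workbook next to the old one instead of replacing it.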

### 2020

In [None]:
# Basic File Setting
workbook = xlsxwriter.Workbook('India2020.xlsx')
worksheet = workbook.add_worksheet()
base_url = "https://www.bseindia.com"
corp = "corp-information/"
finan = "financials-results/"
row = 0
column = 0
failList = []
company_links = []

# Data Fields(Column Names) that will be collected
content = ['헤브론스타국가코드','현지언어국가명','영문국가명','시간','대륙','GDP','인구','지역','기업식별코드','현지언어기업명','영문기업명','현지언어한줄소개내용','영문한줄소개내용','현지언어기업소개내용','영문기업소개내용','설립일자','법인등록번호','사업자등록번호','기업대표전화번호','대표팩스번호','대표이메일','기업홈페이지URL','페이스북URL','인스타그램URL','유튜브URL','링크드인URL','트위터핸들','현지언어기업주소','영문기업주소','현지언어기업상세주소','영문기업상세주소','기업우편번호','기업종업원','외감법인구분','기업연수','기업상태','현지언어담당자명','영문담당자명','현지언어직위명','영문직위명','담당자부서명','담당자전화번호','담당자팩스번호','담당자이메일','담당자이동전화번호','회계연도','유동자산금액','비유동자산금액','자산총계금액','유동부채금액','비유동부채금액','부채총계금액','자본총계금액','부채자본총계금액','매출액','매출원가금액','판매비관리비금액','영업이익손실금액','금융수익금액','금융비용금액','기타영업외수익금액','기타영업외비용금액','법인세차감전순이익','법인세비용','당기순이익','영업활동현금흐름금액','투자활동현금흐름금액','재무활동현금흐름금액','기초현금자산금액','기말현금자산금액','부채비율','영업이익율','매출액증가율','영업이익증가율','당기순이익 증가율','기업 CAGR','현지언어산업군명','영문산업군명','현지언어주요제품명내용','영문주요제품명내용','국가언어코드','현지언어언어명','영문언어명','주식시장코드','현지언어주식시장명','영문주식시장명','상장코드','상장일자','주가(일)','주가(1주)','주가(1개월)','주가(6개월)','주가(1년)','주가(3년)','주가(5년)','주가(10년)','거래량','시가총액','지점코드','지점명','주소','주소상세','우편번호','사업자등록번호','이벤트','통화구분코드','화폐단위명','담당자','소스','날짜']

# Put Column names on the sheet
for item in content :
    worksheet.write(row, column, item)
    column += 1
row += 1

# Open the BSE India list-of-scrips page
driver = webdriver.Chrome(path)
driver.get('https://www.bseindia.com/corporates/List_Scrips.html')
time.sleep(4)
driver.find_element(By.XPATH, value='/html/body/div[4]/div[1]/div[2]/select').click()
time.sleep(1)
driver.find_element(By.XPATH, value='/html/body/div[4]/div[1]/div[2]/select/option[2]').click()
time.sleep(1)
driver.find_element(By.XPATH, value='/html/body/div[4]/div[1]/div[4]/select').click()
time.sleep(1)
driver.find_element(By.XPATH, value='/html/body/div[4]/div[1]/div[4]/select/option[2]').click()
time.sleep(1)
driver.find_element(By.XPATH, value='/html/body/div[4]/div[2]/div[5]/input').click()
time.sleep(10)
text = driver.page_source
time.sleep(1)

# Get each company's links from a tags
soup = bs4.BeautifulSoup(text,'html.parser')
maintable = soup.find('table',{"class":"mGrid ng-scope"})
time.sleep(1)
all_atag_maintable = maintable.find_all('a')
all_atag_maintableHead = all_atag_maintable

driver.find_element(By.XPATH, value='/html/body/div[4]/div[1]/div[4]/select').click()
time.sleep(1)
driver.find_element(By.XPATH, value='/html/body/div[4]/div[1]/div[4]/select/option[3]').click()
time.sleep(1)
driver.find_element(By.XPATH, value='/html/body/div[4]/div[2]/div[5]/input').click()
time.sleep(7)
text2 = driver.page_source
time.sleep(1)

# Get each company's links from a tags
soup2 = bs4.BeautifulSoup(text2,'html.parser')
maintable2 = soup2.find('table',{"class":"mGrid ng-scope"})
time.sleep(1)
all_atag_maintable2 = maintable2.find_all('a')
all_atag_maintableHead2 = all_atag_maintable2

driver.find_element(By.XPATH, value='/html/body/div[4]/div[1]/div[4]/select').click()
time.sleep(1)
driver.find_element(By.XPATH, value='/html/body/div[4]/div[1]/div[4]/select/option[4]').click()
time.sleep(1)
driver.find_element(By.XPATH, value='/html/body/div[4]/div[2]/div[5]/input').click()
time.sleep(7)
text3 = driver.page_source
time.sleep(1)

# Get each company's links from a tags
soup3 = bs4.BeautifulSoup(text3,'html.parser')
maintable3 = soup3.find('table',{"class":"mGrid ng-scope"})
time.sleep(1)
all_atag_maintable3 = maintable3.find_all('a')
all_atag_maintableHead3 = all_atag_maintable3

# Final Link Setting: collect the href of every company link from the three result tables
for atags in (all_atag_maintableHead, all_atag_maintableHead2, all_atag_maintableHead3):
    for a in atags:
        company_links.append(a.attrs["href"])

# To check if all companies are selected
# print(len(company_links))

# Main Data Collection code
for sub in company_links:
    # Start from A column every time
    column = 0
    info = []
    try:
        # crawling environment setting
        driver.get(sub)
        time.sleep(2)
        # Company Basic Information Setting
        name = driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[2]/div/div[1]/div[1]/div[1]/div[2]/div/h1').text
        try:
            status = driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[4]/div[2]/div[5]/div/table/tbody/tr[1]/td[2]').text
        except:
            status = "Public"
        try:
            revenue = driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div[3]/div/div[2]/div/div/div/div[1]/div/div[1]/table/tbody[1]/tr[1]/td[4]').text + "Cr. INR"
        except:
            revenue = ""
        try:
            profit = driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div[3]/div/div[2]/div/div/div/div[1]/div/div[1]/table/tbody[2]/tr[1]/td[4]').text + "Cr. INR"
        except:
            profit = ""
        # Country Information - Must be investigated individually
        info.append("IND") 
        info.append("India") 
        info.append("India") 
        info.append("UTC+05:30") 
        info.append("아시아") 
        info.append("11745000000000 USD") 
        info.append("1352642280") 
        info.append("남아시아")
        driver.get(sub + corp)
        time.sleep(2)
        # Company Basic Information
        try:
            info.append("HBR" + driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[2]/div/div/table/tbody/tr/td/table/tbody/tr[1]/td[2]').text)   
        except:
            info.append("HBRIND" + name) 
        info.append(name)
        info.append(name)
        try: 
            industry1 = driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[2]/div/div/table/tbody/tr/td/table/tbody/tr[3]/td[2]').text
            industry2 = driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[2]/div/div/table/tbody/tr/td/table/tbody/tr[3]/td[2]').text
        except:
            industry1 = "public company"
            industry2 = "products and services"

        # Description, Contact, Address, Extra Information, Management Information
        try: 
            address = driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[6]/div/div/table/tbody/tr/td/table/tbody/tr[1]/td').text
        except: 
            address = "India, South Asia, Asia"

        info.append(name + " (English: " + name + ")" + " is a public company that is listed on Bombay Stock Exchange.")
        info.append(name + " (English: " + name + ")" + " is a public company that is listed on Bombay Stock Exchange.")
        info.append(name + " (English: " + name + ")" + " is a "+ industry2 + " company that is listed on Bombay Stock Exchange. They are providing " + industry2  + " in " + address + ", and they are operated as a public company.")
        info.append(name + " (English: " + name + ")" + " is a "+ industry2 + " company that is listed on Bombay Stock Exchange. They are providing " + industry2  + " in " + address + ", and they are operated as a public company.")
        # Date of establishment
        try:
            date = driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[2]/div/div/table/tbody/tr/td/table/tbody/tr[7]/td[2]').text
        except:
            date = ""
        info.append(date)
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[2]/div/div/table/tbody/tr/td/table/tbody/tr[2]/td[2]').text) 
        except:
            info.append("")
        info.append("")

        # Contact information
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[6]/div/div/table/tbody/tr/td/table/tbody/tr[2]/td[2]').text) 
        except:
            info.append("")
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[6]/div/div/table/tbody/tr/td/table/tbody/tr[3]/td[2]').text) 
        except:
            info.append("")
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[6]/div/div/table/tbody/tr/td/table/tbody/tr[4]/td[2]').text) 
        except:
            info.append("")
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[6]/div/div/table/tbody/tr/td/table/tbody/tr[5]/td[2]').text) 
        except:
            info.append(sub)
        info.append("")
        info.append("")
        info.append("")
        info.append("")
        info.append("")
        info.append(address)
        info.append(address)
        info.append(address)
        info.append(address)
        info.append(address) 

        # Number of employees
        info.append("")
        info.append("")
        info.append("")

        # Company status
        info.append(status)
        # Contact person's name (local language)
        try: 
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[3]/div/div/table/tbody/tr/td/table/tbody/tr[2]/td[1]').text)
        except:
            info.append("")
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[3]/div/div/table/tbody/tr/td/table/tbody/tr[2]/td[1]').text)
        except:
            info.append("")
        # Position and department
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[3]/div/div/table/tbody/tr/td/table/tbody/tr[2]/td[2]').text)
        except:
            info.append("Key Executive")
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[3]/div/div/table/tbody/tr/td/table/tbody/tr[2]/td[2]').text)
        except:
            info.append("Key Executive")
        info.append("Board of Directors")

        # Contact person's contact details
        try :
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[6]/div/div/table/tbody/tr/td/table/tbody/tr[2]/td[2]').text) 
        except:
            info.append("")
        try :
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[6]/div/div/table/tbody/tr/td/table/tbody/tr[3]/td[2]').text) 
        except:
            info.append("")
        try :
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[6]/div/div/table/tbody/tr/td/table/tbody/tr[4]/td[2]').text) 
        except:
            info.append("")
        try :
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[6]/div/div/table/tbody/tr/td/table/tbody/tr[2]/td[2]').text) 
        except:
            info.append("")

        # Financial Information
        driver.get(sub + finan)
        time.sleep(2)
        try:
            driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/ul/li[2]/a').click()
            time.sleep(1)
            # Fiscal year
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/thead/tr/td[3]').text) 
            except:
                try:
                    info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/thead/tr/td[3]').text) 
                except:
                    info.append("TTM")
            # Current assets
            info.append("")
            # Non-current assets
            info.append("")
            # Total assets
            info.append("")
            # Current liabilities
            info.append("")
            # Non-current liabilities
            info.append("")
            # Total liabilities
            info.append("")
            # Total equity
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[12]/td[3]').text + "Cr. INR")
            except:
                try:
                    info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[12]/td[3]').text + "Cr. INR")
                except:
                    info.append("")
            # Total liabilities and equity
            info.append("")

            # Revenue
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[2]/td[3]').text + "Cr. INR")
            except:
                try:
                    info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[2]/td[3]').text + "Cr. INR")
                except:
                    info.append(revenue)
            # Cost of sales
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[5]/td[3]').text + "Cr. INR")
            except:
                try:
                    info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[5]/td[3]').text + "Cr. INR")
                except:
                    info.append("") 
            # Selling, general and administrative expenses
            info.append("")
            # Operating profit/loss
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[7]/td[3]').text + "Cr. INR")
            except:
                try:
                    info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[7]/td[3]').text + "Cr. INR")
                except:
                    info.append("")
            # Finance income
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[6]/td[3]').text + "Cr. INR")
            except:
                try:
                    info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[6]/td[3]').text + "Cr. INR")
                except:
                    info.append("")
            # Finance costs
            info.append("")
            # Other non-operating income
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[3]/td[3]').text + "Cr. INR")
            except:
                try:
                    info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[3]/td[3]').text + "Cr. INR")
                except:
                    info.append("") 
            # Other non-operating expenses
            info.append("")

            # Profit before income tax
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[9]/td[3]').text + "Cr. INR")
            except:
                try:
                    info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[9]/td[3]').text + "Cr. INR")
                except:
                    info.append("")
            # Income tax expense
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[10]/td[3]').text + "Cr. INR")
            except:
                try:
                    info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[10]/td[3]').text + "Cr. INR")
                except:
                    info.append("")
            # Net profit
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[11]/td[3]').text + "Cr. INR")
            except:
                try:
                    info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[11]/td[3]').text + "Cr. INR")
                except:
                    info.append(profit)

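            # Cash flow
            # Operating activities
            info.append("")
            # Investing activities
            info.append("")
            # Financing activities
            info.append("")
            # Opening and closing cash balances
            info.append("")
            info.append("")
            # Ratios to be calculated later from the financial data
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")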

            # Industry and Products      
            info.append(industry1)
            info.append(industry1)
            info.append(industry2)
            info.append(industry2)


            # Language Information
            info.append("ENG")
            info.append("English")
            info.append("English")
        except:
            # Financial Information - Default
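            # Fiscal year
            info.append("TTM")
            # Current assets
            info.append("")
            # Non-current assets
            info.append("")
            # Total assets
            info.append("")
            # Current liabilities
            info.append("")
            # Non-current liabilities
            info.append("")
            # Total liabilities
            info.append("")
            # Total equity
            info.append("")
            # Total liabilities and equity
            info.append("")

            # Revenue
            info.append(revenue)
            # Cost of sales
            info.append("")
            # Selling, general and administrative expenses
            info.append("")
            # Operating profit/loss
            info.append("")
            # Finance income
            info.append("")
            # Finance costs
            info.append("")
            # Other non-operating income
            info.append("")
            # Other non-operating expenses
            info.append("")
            # Profit before income tax
            info.append("")
            # Income tax expense
            info.append("")
            # Net profit
            info.append(profit)

            # Cash flow
            # Operating activities
            info.append("")
            # Investing activities
            info.append("")
            # Financing activities
            info.append("")
            # Opening and closing cash balances
            info.append("")
            info.append("")
            # Ratios to be calculated later from the financial data
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")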

            # Industry and Products       
            info.append(industry1)
            info.append(industry1)
            info.append(industry2)
            info.append(industry2)


            # Language Information
            info.append("ENG")
            info.append("English")
            info.append("English")

        # Stock Market Information
        driver.get(sub)
        time.sleep(2)
        info.append("BSE")
        info.append("Bombay Stock Exchange")
        info.append("Bombay Stock Exchange")
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div[4]/div/div[2]/div/div/table/tbody[1]/tr[2]/th[2]/strong').text)
        except:
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[2]/div/div[1]/div[1]/div[1]/div[2]/div/div[2]').text)
            except:
                info.append("")
        # Listing date
        info.append(date)

        # Share price
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[4]/div[2]/div[1]/div/table/tbody/tr[1]/td[2]').text + "INR")
        except:
            info.append("")
        info.append("")
        info.append("")
        info.append("")
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[4]/div[2]/div[2]/div/table/tbody/tr[1]/td[2]').text + "INR")
        except:
            info.append("")
        info.append("")
        info.append("")
        info.append("")

        # Trading volume
        info.append("")
        # Market capitalization
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[4]/div[2]/div[3]/div/table/tbody/tr[5]/td[2]').text + "Cr. INR")
        except:
            info.append("")

        # Branch
        info.append("")
        info.append("")
        info.append("India")
        info.append("India, Asia")
        info.append("")
        info.append("")

        # Event
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div[3]/div/div[1]/div/div/div/div[1]/table/tbody[1]/tr[1]/td[1]/a').text)
        except:
            info.append("")

        # Currency Information
        info.append("INR")
        info.append("Indian Rupee")

        # Management
        info.append("Chris")
        info.append("bseindia.com")
        info.append("2022-07-14")
        
        # Status Check
        print(row)
        print(info)  
        
        # Adding Collected Data
        for item in info :
            worksheet.write(row, column, item)
            column += 1
        row += 1
    except:
        try:
            # crawling environment setting
            print(row)
            info = []
            driver.get(sub)
            time.sleep(2)
            # Company Basic Information
            name = driver.find_element(By.XPATH, value='/html/body/div[1]/div[6]/div/div[3]/div/div[1]/div[1]/div[1]/div[2]/div/h1').text
            status = "Public"
            revenue = ""
            profit = ""
            
            # Country Information - Must be investigated individually
            info.append("IND") 
            info.append("India") 
            info.append("India") 
            info.append("UTC+05:30") 
            info.append("아시아") 
            info.append("11745000000000 USD") 
            info.append("1352642280") 
            info.append("남아시아")
            
            # Company Code/Name
            info.append("HBRIND" + name) 
            info.append(name)
            info.append(name)
            industry1 = "public company"
            industry2 = "products and services"

            # Industry and Products - Default
            address = "India, South Asia, Asia"

            info.append(name + " (English: " + name + ")" + " is a public company that is listed on Bombay Stock Exchange.")
            info.append(name + " (English: " + name + ")" + " is a public company that is listed on Bombay Stock Exchange.")
            info.append(name + " (English: " + name + ")" + " is a "+ industry2 + " company that is listed on Bombay Stock Exchange. They are providing " + industry2  + " in " + address + ", and they are operated as a public company.")
            info.append(name + " (English: " + name + ")" + " is a "+ industry2 + " company that is listed on Bombay Stock Exchange. They are providing " + industry2  + " in " + address + ", and they are operated as a public company.")
            # Date of establishment
            date = ""
            info.append("")
            info.append("")
            info.append("")

            # Contact information
            info.append("")
            info.append("")
            info.append("")
            info.append(sub)
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append(address)
            info.append(address)
            info.append(address)
            info.append(address)
            info.append(address) 

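            # Number of employees
            info.append("")
            info.append("")
            info.append("")

            # Company status
            info.append(status)
            # Contact person's name (local language)
            info.append("")
            info.append("")
            # Position and department
            info.append("Key Executive")
            info.append("Key Executive")
            info.append("Board of Directors")

            # Contact person's contact details
            info.append("")
            info.append("")
            info.append("")
            info.append("")

            # Financial Information

            # Fiscal year
            info.append("TTM")
            # Current assets
            info.append("")
            # Non-current assets
            info.append("")
            # Total assets
            info.append("")
            # Current liabilities
            info.append("")
            # Non-current liabilities
            info.append("")
            # Total liabilities
            info.append("")
            # Total equity
            info.append("")
            # Total liabilities and equity
            info.append("")

            # Revenue
            info.append(revenue)
            # Cost of sales
            info.append("")
            # Selling, general and administrative expenses
            info.append("")
            # Operating profit/loss
            info.append("")
            # Finance income
            info.append("")
            # Finance costs
            info.append("")
            # Other non-operating income
            info.append("")
            # Other non-operating expenses
            info.append("")

            # Profit before income tax
            info.append("")
            # Income tax expense
            info.append("")
            # Net profit
            info.append(profit)

            # Cash flow
            # Operating activities
            info.append("")
            # Investing activities
            info.append("")
            # Financing activities
            info.append("")
            # Opening and closing cash balances
            info.append("")
            info.append("")
            # Ratios to be calculated later from the financial data
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")

            # Industry and Products - Default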
            info.append(industry1)
            info.append(industry1)
            info.append(industry2)
            info.append(industry2)

            # Language Information
            info.append("ENG")
            info.append("English")
            info.append("English")

            # Stock Market Information - Default
            info.append("BSE")
            info.append("Bombay Stock Exchange")
            info.append("Bombay Stock Exchange")
            info.append("")
            
            # Listing date
            info.append(date)

            # Share price
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")

            # Trading volume
            info.append("")
            # Market capitalization
            info.append("")

            # Branch
            info.append("")
            info.append("")
            info.append("India")
            info.append("India, Asia")
            info.append("")
            info.append("")

            # Event - Default
            info.append("")

            # Currency Information
            info.append("INR")
            info.append("Indian Rupee")

            # Management (person in charge, source, date)
            info.append("Chris")
            info.append("bseindia.com")
            info.append("2022-07-14")
            
            #Adding Collected Data
            print(info)
            for item in info :
                worksheet.write(row, column, item)
                column += 1
            row += 1
            
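        # If the fallback also fails, record the link in failList.
        # Catch Exception rather than everything so that Ctrl+C can still stop the crawl.
        except Exception:
            failList.append(sub)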
            
# Check the failed links.
# - failList can be handled in several ways: retry, ignore, add the rows manually, or investigate why each error occurred.
# - The choice is up to the user.
print("failLength:")
print(len(failList))
print(failList)

# Closing the workbook is necessary: xlsxwriter writes the file when close() is called, so skipping it can lose the collected data.
workbook.close()
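
The comments above name retrying as one way to deal with `failList`. A minimal sketch of such a retry pass follows. `retry_failed`, `attempts`, and `delay` are hypothetical names introduced here, and `collect` is assumed to be a callable that wraps the per-company logic of the main loop and raises an exception when a page cannot be read. None of these exist in the original notebook.

```python
import time


def retry_failed(links, collect, attempts=2, delay=0.0):
    """Retry each link up to `attempts` times and return the links that still fail.

    `collect` is a hypothetical callable that takes one link and raises an
    exception on failure (for example, a function wrapping the body of the
    main collection loop).
    """
    still_failing = []
    for link in links:
        for _ in range(attempts):
            try:
                collect(link)
                break  # success: stop retrying this link
            except Exception:
                time.sleep(delay)  # wait before the next attempt
        else:
            # The inner loop ended without `break`, so every attempt failed.
            still_failing.append(link)
    return still_failing
```

Under these assumptions, a call such as `failList = retry_failed(failList, collect, attempts=3, delay=5)` would leave only the links that failed on every attempt, which can then be inspected manually.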

### 2019

In [None]:
# Basic File Setting
workbook = xlsxwriter.Workbook('India2019.xlsx')
worksheet = workbook.add_worksheet()
base_url = "https://www.bseindia.com"
corp = "corp-information/"
finan = "financials-results/"
row = 0
column = 0
company_links = []
failList = []  # links that fail in this cell; reset so failures from other cells are not mixed in

# Data fields (column names) to collect. The Korean labels are written to the sheet as the header row, so they are kept as-is.
content = ['헤브론스타국가코드','현지언어국가명','영문국가명','시간','대륙','GDP','인구','지역','기업식별코드','현지언어기업명','영문기업명','현지언어한줄소개내용','영문한줄소개내용','현지언어기업소개내용','영문기업소개내용','설립일자','법인등록번호','사업자등록번호','기업대표전화번호','대표팩스번호','대표이메일','기업홈페이지URL','페이스북URL','인스타그램URL','유튜브URL','링크드인URL','트위터핸들','현지언어기업주소','영문기업주소','현지언어기업상세주소','영문기업상세주소','기업우편번호','기업종업원','외감법인구분','기업연수','기업상태','현지언어담당자명','영문담당자명','현지언어직위명','영문직위명','담당자부서명','담당자전화번호','담당자팩스번호','담당자이메일','담당자이동전화번호','회계연도','유동자산금액','비유동자산금액','자산총계금액','유동부채금액','비유동부채금액','부채총계금액','자본총계금액','부채자본총계금액','매출액','매출원가금액','판매비관리비금액','영업이익손실금액','금융수익금액','금융비용금액','기타영업외수익금액','기타영업외비용금액','법인세차감전순이익','법인세비용','당기순이익','영업활동현금흐름금액','투자활동현금흐름금액','재무활동현금흐름금액','기초현금자산금액','기말현금자산금액','부채비율','영업이익율','매출액증가율','영업이익증가율','당기순이익 증가율','기업 CAGR','현지언어산업군명','영문산업군명','현지언어주요제품명내용','영문주요제품명내용','국가언어코드','현지언어언어명','영문언어명','주식시장코드','현지언어주식시장명','영문주식시장명','상장코드','상장일자','주가(일)','주가(1주)','주가(1개월)','주가(6개월)','주가(1년)','주가(3년)','주가(5년)','주가(10년)','거래량','시가총액','지점코드','지점명','주소','주소상세','우편번호','사업자등록번호','이벤트','통화구분코드','화폐단위명','담당자','소스','날짜']

# Put Column names on the sheet
for item in content :
    worksheet.write(row, column, item)
    column += 1
row += 1

# Open the BSE India scrip list page and choose the filter options
driver = webdriver.Chrome(path)
driver.get('https://www.bseindia.com/corporates/List_Scrips.html')
time.sleep(4)
driver.find_element(By.XPATH, value='/html/body/div[4]/div[1]/div[2]/select').click()
time.sleep(1)
driver.find_element(By.XPATH, value='/html/body/div[4]/div[1]/div[2]/select/option[2]').click()
time.sleep(1)
driver.find_element(By.XPATH, value='/html/body/div[4]/div[1]/div[4]/select').click()
time.sleep(1)
driver.find_element(By.XPATH, value='/html/body/div[4]/div[1]/div[4]/select/option[2]').click()
time.sleep(1)
driver.find_element(By.XPATH, value='/html/body/div[4]/div[2]/div[5]/input').click()
time.sleep(10)
text = driver.page_source
time.sleep(1)

# Collect each company's link from the <a> tags of the first result table
soup = bs4.BeautifulSoup(text,'html.parser')
maintable = soup.find('table',{"class":"mGrid ng-scope"})
time.sleep(1)
all_atag_maintable = maintable.find_all('a')
all_atag_maintableHead = all_atag_maintable

driver.find_element(By.XPATH, value='/html/body/div[4]/div[1]/div[4]/select').click()
time.sleep(1)
driver.find_element(By.XPATH, value='/html/body/div[4]/div[1]/div[4]/select/option[3]').click()
time.sleep(1)
driver.find_element(By.XPATH, value='/html/body/div[4]/div[2]/div[5]/input').click()
time.sleep(7)
text2 = driver.page_source
time.sleep(1)

# Collect each company's link from the <a> tags of the second result table
soup2 = bs4.BeautifulSoup(text2,'html.parser')
maintable2 = soup2.find('table',{"class":"mGrid ng-scope"})
time.sleep(1)
all_atag_maintable2 = maintable2.find_all('a')
all_atag_maintableHead2 = all_atag_maintable2

driver.find_element(By.XPATH, value='/html/body/div[4]/div[1]/div[4]/select').click()
time.sleep(1)
driver.find_element(By.XPATH, value='/html/body/div[4]/div[1]/div[4]/select/option[4]').click()
time.sleep(1)
driver.find_element(By.XPATH, value='/html/body/div[4]/div[2]/div[5]/input').click()
time.sleep(7)
text3 = driver.page_source
time.sleep(1)

# Collect each company's link from the <a> tags of the third result table
soup3 = bs4.BeautifulSoup(text3,'html.parser')
maintable3 = soup3.find('table',{"class":"mGrid ng-scope"})
time.sleep(1)
all_atag_maintable3 = maintable3.find_all('a')
all_atag_maintableHead3 = all_atag_maintable3

# Add the links from the first result table to company_links
for a in all_atag_maintableHead:
    company_link = a.attrs["href"]
    company_links.append(company_link)

# Add the links from the second result table to company_links
for a in all_atag_maintableHead2:
    company_link = a.attrs["href"]
    company_links.append(company_link)

# Add the links from the third result table to company_links
for a in all_atag_maintableHead3:
    company_link = a.attrs["href"]
    company_links.append(company_link)

# To check if all companies are selected
# print(len(company_links))

# Main Data Collection code
for sub in company_links:
    # Start from A column every time
    column = 0
    info = []
    try:
        # crawling environment setting
        driver.get(sub)
        time.sleep(2)
        # Company Basic Information Setting
        name = driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[2]/div/div[1]/div[1]/div[1]/div[2]/div/h1').text
        try:
            status = driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[4]/div[2]/div[5]/div/table/tbody/tr[1]/td[2]').text
        except:
            status = "Public"
        try:
            revenue = driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div[3]/div/div[2]/div/div/div/div[1]/div/div[1]/table/tbody[1]/tr[1]/td[4]').text + "Cr. INR"
        except:
            revenue = ""
        try:
            profit = driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div[3]/div/div[2]/div/div/div/div[1]/div/div[1]/table/tbody[2]/tr[1]/td[4]').text + "Cr. INR"
        except:
            profit = ""
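        # Country Information - Must be investigated individually
        info.append("IND")  # country code
        info.append("India")  # country name (local language)
        info.append("India")  # country name (English)
        info.append("UTC+05:30")  # time zone
        info.append("아시아")  # continent: "Asia", stored in Korean
        info.append("11745000000000 USD")  # GDP
        info.append("1352642280")  # population
        info.append("남아시아")  # region: "South Asia", stored in Korean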
        driver.get(sub + corp)
        time.sleep(2)
        # Company Basic Information
        try:
            info.append("HBR" + driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[2]/div/div/table/tbody/tr/td/table/tbody/tr[1]/td[2]').text)   
        except:
            info.append("HBRIND" + name) 
        info.append(name)
        info.append(name)
        try: 
            industry1 = driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[2]/div/div/table/tbody/tr/td/table/tbody/tr[3]/td[2]').text
            industry2 = driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[2]/div/div/table/tbody/tr/td/table/tbody/tr[3]/td[2]').text
        except:
            industry1 = "public company"
            industry2 = "products and services"

        # Description, Contact, Address, Extra Information, Management Information
        try: 
            address = driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[6]/div/div/table/tbody/tr/td/table/tbody/tr[1]/td').text
        except: 
            address = "India, South Asia, Asia"

        info.append(name + " (English: " + name + ")" + " is a public company that is listed on Bombay Stock Exchange.")
        info.append(name + " (English: " + name + ")" + " is a public company that is listed on Bombay Stock Exchange.")
        info.append(name + " (English: " + name + ")" + " is a "+ industry2 + " company that is listed on Bombay Stock Exchange. They are providing " + industry2  + " in " + address + ", and they are operated as a public company.")
        info.append(name + " (English: " + name + ")" + " is a "+ industry2 + " company that is listed on Bombay Stock Exchange. They are providing " + industry2  + " in " + address + ", and they are operated as a public company.")
        # Establishment date
        try:
            date = driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[2]/div/div/table/tbody/tr/td/table/tbody/tr[7]/td[2]').text
        except:
            date = ""
        # date is always defined at this point, so no try/except is needed
        info.append(date)
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[2]/div/div/table/tbody/tr/td/table/tbody/tr[2]/td[2]').text) 
        except:
            info.append("")
        info.append("")

        # Contact (phone, fax, email, website)
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[6]/div/div/table/tbody/tr/td/table/tbody/tr[2]/td[2]').text) 
        except:
            info.append("")
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[6]/div/div/table/tbody/tr/td/table/tbody/tr[3]/td[2]').text) 
        except:
            info.append("")
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[6]/div/div/table/tbody/tr/td/table/tbody/tr[4]/td[2]').text) 
        except:
            info.append("")
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[6]/div/div/table/tbody/tr/td/table/tbody/tr[5]/td[2]').text) 
        except:
            info.append(sub)
        info.append("")
        info.append("")
        info.append("")
        info.append("")
        info.append("")
        info.append(address)
        info.append(address)
        info.append(address)
        info.append(address)
        info.append(address) 

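        # Number of employees, external-audit classification, company age
        info.append("")
        info.append("")
        info.append("")

        # Company status
        info.append(status)
        # Contact person name (local language, then English)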
        try: 
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[3]/div/div/table/tbody/tr/td/table/tbody/tr[2]/td[1]').text)
        except:
            info.append("")
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[3]/div/div/table/tbody/tr/td/table/tbody/tr[2]/td[1]').text)
        except:
            info.append("")
        # Position (local language, then English) and department
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[3]/div/div/table/tbody/tr/td/table/tbody/tr[2]/td[2]').text)
        except:
            info.append("Key Executive")
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[3]/div/div/table/tbody/tr/td/table/tbody/tr[2]/td[2]').text)
        except:
            info.append("Key Executive")
        info.append("Board of Directors")

        # Contact person's phone, fax, email, and mobile number
        try :
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[6]/div/div/table/tbody/tr/td/table/tbody/tr[2]/td[2]').text) 
        except:
            info.append("")
        try :
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[6]/div/div/table/tbody/tr/td/table/tbody/tr[3]/td[2]').text) 
        except:
            info.append("")
        try :
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[6]/div/div/table/tbody/tr/td/table/tbody/tr[4]/td[2]').text) 
        except:
            info.append("")
        try :
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[6]/div/div/table/tbody/tr/td/table/tbody/tr[2]/td[2]').text) 
        except:
            info.append("")

        # Financial Information
        driver.get(sub + finan)
        time.sleep(2)
        try:
            driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/ul/li[2]/a').click()
            time.sleep(1)
            # Fiscal year
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/thead/tr/td[4]').text) 
            except:
                try:
                    info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/thead/tr/td[3]').text) 
                except:
                    info.append("TTM")
            # Current assets
            info.append("")
            # Non-current assets
            info.append("")
            # Total assets
            info.append("")
            # Current liabilities
            info.append("")
            # Non-current liabilities
            info.append("")
            # Total liabilities
            info.append("")
            # Total equity
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[12]/td[4]').text + "Cr. INR")
            except:
                try:
                    info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[12]/td[3]').text + "Cr. INR")
                except:
                    info.append("")
            # Total liabilities and equity
            info.append("")

            # Revenue
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[2]/td[4]').text + "Cr. INR")
            except:
                try:
                    info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[2]/td[3]').text + "Cr. INR")
                except:
                    info.append(revenue)
            # Cost of goods sold
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[5]/td[4]').text + "Cr. INR")
            except:
                try:
                    info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[5]/td[3]').text + "Cr. INR")
                except:
                    info.append("") 
            # Selling, general and administrative expenses
            info.append("")
            # Operating income (loss)
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[7]/td[4]').text + "Cr. INR")
            except:
                try:
                    info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[7]/td[3]').text + "Cr. INR")
                except:
                    info.append("")
            # Financial income
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[6]/td[4]').text + "Cr. INR")
            except:
                try:
                    info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[6]/td[3]').text + "Cr. INR")
                except:
                    info.append("")
            # Financial expenses
            info.append("")
            # Other non-operating income
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[3]/td[4]').text + "Cr. INR")
            except:
                try:
                    info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[3]/td[3]').text + "Cr. INR")
                except:
                    info.append("") 
            # Other non-operating expenses
            info.append("")

            # Profit before income tax
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[9]/td[4]').text + "Cr. INR")
            except:
                try:
                    info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[9]/td[3]').text + "Cr. INR")
                except:
                    info.append("")
            # Income tax expense
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[10]/td[4]').text + "Cr. INR")
            except:
                try:
                    info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[10]/td[3]').text + "Cr. INR")
                except:
                    info.append("")
            # Net income
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[11]/td[4]').text + "Cr. INR")
            except:
                try:
                    info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[11]/td[3]').text + "Cr. INR")
                except:
                    info.append(profit)

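            # Cash flow
            # Operating activities
            info.append("")
            # Investing activities
            info.append("")
            # Financing activities
            info.append("")
            # Beginning and ending cash balance
            info.append("")
            info.append("")
            # Ratios to be calculated later from the financial data
            # (debt ratio, operating margin, revenue growth, operating income growth, net income growth, CAGR)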
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")

            # Industry and Products      
            info.append(industry1)
            info.append(industry1)
            info.append(industry2)
            info.append(industry2)


            # Language Information
            info.append("ENG")
            info.append("English")
            info.append("English")
        except:
            # Financial Information - Default
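            # Fiscal year
            info.append("TTM")
            # Current assets
            info.append("")
            # Non-current assets
            info.append("")
            # Total assets
            info.append("")
            # Current liabilities
            info.append("")
            # Non-current liabilities
            info.append("")
            # Total liabilities
            info.append("")
            # Total equity
            info.append("")
            # Total liabilities and equity
            info.append("")

            # Revenue
            info.append(revenue)
            # Cost of goods sold
            info.append("")
            # Selling, general and administrative expenses
            info.append("")
            # Operating income (loss)
            info.append("")
            # Financial income
            info.append("")
            # Financial expenses
            info.append("")
            # Other non-operating income
            info.append("")
            # Other non-operating expenses
            info.append("")
            # Profit before income tax
            info.append("")
            # Income tax expense
            info.append("")
            # Net income
            info.append(profit)

            # Cash flow
            # Operating activities
            info.append("")
            # Investing activities
            info.append("")
            # Financing activities
            info.append("")
            # Beginning and ending cash balance
            info.append("")
            info.append("")
            # Ratios to be calculated later from the financial data
            # (debt ratio, operating margin, revenue growth, operating income growth, net income growth, CAGR)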
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")

            # Industry and Products       
            info.append(industry1)
            info.append(industry1)
            info.append(industry2)
            info.append(industry2)


            # Language Information
            info.append("ENG")
            info.append("English")
            info.append("English")

        # Stock Market Information
        driver.get(sub)
        time.sleep(2)
        info.append("BSE")
        info.append("Bombay Stock Exchange")
        info.append("Bombay Stock Exchange")
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div[4]/div/div[2]/div/div/table/tbody[1]/tr[2]/th[2]/strong').text)
        except:
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[2]/div/div[1]/div[1]/div[1]/div[2]/div/div[2]').text)
            except:
                info.append("")
        # Listing date
        info.append(date)

        # Stock price (1 day, 1 week, 1 month, 6 months, 1 year, 3 years, 5 years, 10 years)
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[4]/div[2]/div[1]/div/table/tbody/tr[1]/td[2]').text + "INR")
        except:
            info.append("")
        info.append("")
        info.append("")
        info.append("")
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[4]/div[2]/div[2]/div/table/tbody/tr[1]/td[2]').text + "INR")
        except:
            info.append("")
        info.append("")
        info.append("")
        info.append("")

        # Trading volume
        info.append("")
        # Market capitalization
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[4]/div[2]/div[3]/div/table/tbody/tr[5]/td[2]').text + "Cr. INR")
        except:
            info.append("")

        # Branch (code, name, address, detailed address, postal code, business registration number)
        info.append("")
        info.append("")
        info.append("India")
        info.append("India, Asia")
        info.append("")
        info.append("")

        # Event
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div[3]/div/div[1]/div/div/div/div[1]/table/tbody[1]/tr[1]/td[1]/a').text)
        except:
            info.append("")

        # Currency Information
        info.append("INR")
        info.append("Indian Rupee")

        # Management
        info.append("Chris")
        info.append("bseindia.com")
        info.append("2022-07-14")
        
        # Status Check
        print(row)
        print(info)  
        
        # Adding Collected Data
        for item in info :
            worksheet.write(row, column, item)
            column += 1
        row += 1
    except Exception:
        try:
            # crawling environment setting
            print(row)
            info = []
            driver.get(sub)
            time.sleep(2)
            # Company Basic Information
            name = driver.find_element(By.XPATH, value='/html/body/div[1]/div[6]/div/div[3]/div/div[1]/div[1]/div[1]/div[2]/div/h1').text
            status = "Public"
            revenue = ""
            profit = ""
            
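            # Country Information - Must be investigated individually
            info.append("IND")  # country code
            info.append("India")  # country name (local language)
            info.append("India")  # country name (English)
            info.append("UTC+05:30")  # time zone
            info.append("아시아")  # continent: "Asia", stored in Korean
            info.append("11745000000000 USD")  # GDP
            info.append("1352642280")  # population
            info.append("남아시아")  # region: "South Asia", stored in Korean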
            
            # Company Code/Name
            info.append("HBRIND" + name) 
            info.append(name)
            info.append(name)
            industry1 = "public company"
            industry2 = "products and services"

            # Industry and Products - Default
            address = "India, South Asia, Asia"

            info.append(name + " (English: " + name + ")" + " is a public company that is listed on Bombay Stock Exchange.")
            info.append(name + " (English: " + name + ")" + " is a public company that is listed on Bombay Stock Exchange.")
            info.append(name + " (English: " + name + ")" + " is a "+ industry2 + " company that is listed on Bombay Stock Exchange. They are providing " + industry2  + " in " + address + ", and they are operated as a public company.")
            info.append(name + " (English: " + name + ")" + " is a "+ industry2 + " company that is listed on Bombay Stock Exchange. They are providing " + industry2  + " in " + address + ", and they are operated as a public company.")
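            # Establishment date, corporate registration number, business registration number
            date = ""
            info.append("")
            info.append("")
            info.append("")

            # Contact (phone, fax, email, website)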
            info.append("")
            info.append("")
            info.append("")
            info.append(sub)
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append(address)
            info.append(address)
            info.append(address)
            info.append(address)
            info.append(address) 

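            # Number of employees, external-audit classification, company age
            info.append("")
            info.append("")
            info.append("")

            # Company status
            info.append(status)
            # Contact person name (local language, then English)
            info.append("")
            info.append("")
            # Position (local language, then English) and department
            info.append("Key Executive")
            info.append("Key Executive")
            info.append("Board of Directors")

            # Contact person's phone, fax, email, and mobile number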
            info.append("")
            info.append("")
            info.append("")
            info.append("")

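            # Financial Information - Default

            # Fiscal year
            info.append("TTM")
            # Current assets
            info.append("")
            # Non-current assets
            info.append("")
            # Total assets
            info.append("")
            # Current liabilities
            info.append("")
            # Non-current liabilities
            info.append("")
            # Total liabilities
            info.append("")
            # Total equity
            info.append("")
            # Total liabilities and equity
            info.append("")

            # Revenue
            info.append(revenue)
            # Cost of goods sold
            info.append("")
            # Selling, general and administrative expenses
            info.append("")
            # Operating income (loss)
            info.append("")
            # Financial income
            info.append("")
            # Financial expenses
            info.append("")
            # Other non-operating income
            info.append("")
            # Other non-operating expenses
            info.append("")

            # Profit before income tax
            info.append("")
            # Income tax expense
            info.append("")
            # Net income
            info.append(profit)

            # Cash flow
            # Operating activities
            info.append("")
            # Investing activities
            info.append("")
            # Financing activities
            info.append("")
            # Beginning and ending cash balance
            info.append("")
            info.append("")
            # Ratios to be calculated later from the financial data
            # (debt ratio, operating margin, revenue growth, operating income growth, net income growth, CAGR)
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")

            # Industry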
            info.append(industry1)
            info.append(industry1)
            info.append(industry2)
            info.append(industry2)

            # Language Information
            info.append("ENG")
            info.append("English")
            info.append("English")

            # Stock Market Information - Default
            info.append("BSE")
            info.append("Bombay Stock Exchange")
            info.append("Bombay Stock Exchange")
            info.append("")
            
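            # Listing date
            info.append(date)

            # Stock price (1 day, 1 week, 1 month, 6 months, 1 year, 3 years, 5 years, 10 years)
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")

            # Trading volume
            info.append("")
            # Market capitalization
            info.append("")

            # Branch (code, name, address, detailed address, postal code, business registration number)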
            info.append("")
            info.append("")
            info.append("India")
            info.append("India, Asia")
            info.append("")
            info.append("")

            # Event - Default
            info.append("")

            # Currency Information
            info.append("INR")
            info.append("Indian Rupee")

            # For status checking: collector, source, and collection date
            info.append("Chris")
            info.append("bseindia.com")
            info.append("2022-07-14")
            
            #Adding Collected Data
            print(info)
            for item in info :
                worksheet.write(row, column, item)
                column += 1
            row += 1
            
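        # If the fallback also fails, record the link in failList.
        # Catch Exception rather than everything so that Ctrl+C can still stop the crawl.
        except Exception:
            failList.append(sub)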
            
# Check the failed links.
# - failList can be handled in several ways: retry, ignore, add the rows manually, or investigate why each error occurred.
# - The choice is up to the user.
print("failLength:")
print(len(failList))
print(failList)

# Closing the workbook is necessary: xlsxwriter writes the file when close() is called, so skipping it can lose the collected data.
workbook.close()
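
Most fields in the cell above follow the same pattern: try one XPath, fall back to a second one, then fall back to a default string. That pattern could be sketched as a single helper. `first_text` and its parameters are hypothetical names introduced here, not part of the original notebook. `find` stands for any callable that returns an object with a `.text` attribute or raises, for example `lambda xp: driver.find_element(By.XPATH, xp)`.

```python
def first_text(find, xpaths, default="", suffix=""):
    """Return the text of the first XPath that resolves, else `default`.

    `find` is a hypothetical lookup callable: it takes an XPath string and
    either returns an object with a `.text` attribute or raises.
    `suffix` (for example "Cr. INR") is appended only to scraped values,
    never to the default.
    """
    for xpath in xpaths:
        try:
            return find(xpath).text + suffix
        except Exception:
            continue  # this XPath did not match; try the next candidate
    return default
```

With such a helper, each nested `try`/`except` pair in the financial section would reduce to one `info.append(first_text(...))` call that lists the `td[4]` path first, the `td[3]` path second, and the existing fallback (such as `profit` or `"TTM"`) as `default`. Listing the candidates explicitly also makes a repeated path easier to spot.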

### 2018

In [None]:
# Basic File Setting
workbook = xlsxwriter.Workbook('India2018.xlsx')
worksheet = workbook.add_worksheet()
base_url = "https://www.bseindia.com"
corp = "corp-information/"
finan = "financials-results/"
row = 0
column = 0
company_links = []
failList = []  # links that fail in this cell; reset so failures from other cells are not mixed in

# Data fields (column names) to collect. The Korean labels are written to the sheet as the header row, so they are kept as-is.
content = ['헤브론스타국가코드','현지언어국가명','영문국가명','시간','대륙','GDP','인구','지역','기업식별코드','현지언어기업명','영문기업명','현지언어한줄소개내용','영문한줄소개내용','현지언어기업소개내용','영문기업소개내용','설립일자','법인등록번호','사업자등록번호','기업대표전화번호','대표팩스번호','대표이메일','기업홈페이지URL','페이스북URL','인스타그램URL','유튜브URL','링크드인URL','트위터핸들','현지언어기업주소','영문기업주소','현지언어기업상세주소','영문기업상세주소','기업우편번호','기업종업원','외감법인구분','기업연수','기업상태','현지언어담당자명','영문담당자명','현지언어직위명','영문직위명','담당자부서명','담당자전화번호','담당자팩스번호','담당자이메일','담당자이동전화번호','회계연도','유동자산금액','비유동자산금액','자산총계금액','유동부채금액','비유동부채금액','부채총계금액','자본총계금액','부채자본총계금액','매출액','매출원가금액','판매비관리비금액','영업이익손실금액','금융수익금액','금융비용금액','기타영업외수익금액','기타영업외비용금액','법인세차감전순이익','법인세비용','당기순이익','영업활동현금흐름금액','투자활동현금흐름금액','재무활동현금흐름금액','기초현금자산금액','기말현금자산금액','부채비율','영업이익율','매출액증가율','영업이익증가율','당기순이익 증가율','기업 CAGR','현지언어산업군명','영문산업군명','현지언어주요제품명내용','영문주요제품명내용','국가언어코드','현지언어언어명','영문언어명','주식시장코드','현지언어주식시장명','영문주식시장명','상장코드','상장일자','주가(일)','주가(1주)','주가(1개월)','주가(6개월)','주가(1년)','주가(3년)','주가(5년)','주가(10년)','거래량','시가총액','지점코드','지점명','주소','주소상세','우편번호','사업자등록번호','이벤트','통화구분코드','화폐단위명','담당자','소스','날짜']

# Put Column names on the sheet
for item in content :
    worksheet.write(row, column, item)
    column += 1
row += 1

# Open the BSE India scrip list page and choose the filter options
driver = webdriver.Chrome(path)
driver.get('https://www.bseindia.com/corporates/List_Scrips.html')
time.sleep(4)
driver.find_element(By.XPATH, value='/html/body/div[4]/div[1]/div[2]/select').click()
time.sleep(1)
driver.find_element(By.XPATH, value='/html/body/div[4]/div[1]/div[2]/select/option[2]').click()
time.sleep(1)
driver.find_element(By.XPATH, value='/html/body/div[4]/div[1]/div[4]/select').click()
time.sleep(1)
driver.find_element(By.XPATH, value='/html/body/div[4]/div[1]/div[4]/select/option[2]').click()
time.sleep(1)
driver.find_element(By.XPATH, value='/html/body/div[4]/div[2]/div[5]/input').click()
time.sleep(10)
text = driver.page_source
time.sleep(1)

# Get each company's link from the <a> tags (first selection)
soup = bs4.BeautifulSoup(text,'html.parser')
maintable = soup.find('table',{"class":"mGrid ng-scope"})
time.sleep(1)
all_atag_maintable = maintable.find_all('a')
all_atag_maintableHead = all_atag_maintable

driver.find_element(By.XPATH, value='/html/body/div[4]/div[1]/div[4]/select').click()
time.sleep(1)
driver.find_element(By.XPATH, value='/html/body/div[4]/div[1]/div[4]/select/option[3]').click()
time.sleep(1)
driver.find_element(By.XPATH, value='/html/body/div[4]/div[2]/div[5]/input').click()
time.sleep(7)
text2 = driver.page_source
time.sleep(1)

# Get each company's link from the <a> tags (second selection)
soup2 = bs4.BeautifulSoup(text2,'html.parser')
maintable2 = soup2.find('table',{"class":"mGrid ng-scope"})
time.sleep(1)
all_atag_maintable2 = maintable2.find_all('a')
all_atag_maintableHead2 = all_atag_maintable2

driver.find_element(By.XPATH, value='/html/body/div[4]/div[1]/div[4]/select').click()
time.sleep(1)
driver.find_element(By.XPATH, value='/html/body/div[4]/div[1]/div[4]/select/option[4]').click()
time.sleep(1)
driver.find_element(By.XPATH, value='/html/body/div[4]/div[2]/div[5]/input').click()
time.sleep(7)
text3 = driver.page_source
time.sleep(1)

# Get each company's link from the <a> tags (third selection)
soup3 = bs4.BeautifulSoup(text3,'html.parser')
maintable3 = soup3.find('table',{"class":"mGrid ng-scope"})
time.sleep(1)
all_atag_maintable3 = maintable3.find_all('a')
all_atag_maintableHead3 = all_atag_maintable3

# Final Link Setting
for a in all_atag_maintableHead:
    company_link = a.attrs["href"]
    company_links.append(company_link)

# Final Link Setting
for a in all_atag_maintableHead2:
    company_link = a.attrs["href"]
    company_links.append(company_link)

# Final Link Setting
for a in all_atag_maintableHead3:
    company_link = a.attrs["href"]
    company_links.append(company_link)

# To check if all companies are selected
# print(len(company_links))

# Main Data Collection code
for sub in company_links:
    # Start from A column every time
    column = 0
    info = []
    try:
        # crawling environment setting
        driver.get(sub)
        time.sleep(2)
        # Company Basic Information Setting
        name = driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[2]/div/div[1]/div[1]/div[1]/div[2]/div/h1').text
        try:
            status = driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[4]/div[2]/div[5]/div/table/tbody/tr[1]/td[2]').text
        except:
            status = "Public"
        try:
            revenue = driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div[3]/div/div[2]/div/div/div/div[1]/div/div[1]/table/tbody[1]/tr[1]/td[4]').text + "Cr. INR"
        except:
            revenue = ""
        try:
            profit = driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div[3]/div/div[2]/div/div/div/div[1]/div/div[1]/table/tbody[2]/tr[1]/td[4]').text + "Cr. INR"
        except:
            profit = ""
        # Country Information - Must be investigated individually
        info.append("IND") 
        info.append("India") 
        info.append("India") 
        info.append("UTC+05:30") 
        info.append("아시아") 
        info.append("11745000000000 USD") 
        info.append("1352642280") 
        info.append("남아시아")
        driver.get(sub + corp)
        time.sleep(2)
        # Company Basic Information
        try:
            info.append("HBR" + driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[2]/div/div/table/tbody/tr/td/table/tbody/tr[1]/td[2]').text)   
        except:
            info.append("HBRIND" + name) 
        info.append(name)
        info.append(name)
        try: 
            industry1 = driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[2]/div/div/table/tbody/tr/td/table/tbody/tr[3]/td[2]').text
            industry2 = driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[2]/div/div/table/tbody/tr/td/table/tbody/tr[3]/td[2]').text
        except:
            industry1 = "public company"
            industry2 = "products and services"

        # Description, Contact, Address, Extra Information, Management Information
        try: 
            address = driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[6]/div/div/table/tbody/tr/td/table/tbody/tr[1]/td').text
        except: 
            address = "India, South Asia, Asia"

        info.append(name + " (English: " + name + ")" + " is a public company that is listed on Bombay Stock Exchange.")
        info.append(name + " (English: " + name + ")" + " is a public company that is listed on Bombay Stock Exchange.")
        info.append(name + " (English: " + name + ")" + " is a "+ industry2 + " company that is listed on Bombay Stock Exchange. They are providing " + industry2  + " in " + address + ", and they are operated as a public company.")
        info.append(name + " (English: " + name + ")" + " is a "+ industry2 + " company that is listed on Bombay Stock Exchange. They are providing " + industry2  + " in " + address + ", and they are operated as a public company.")
        # Founding date
        try:
            date = driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[2]/div/div/table/tbody/tr/td/table/tbody/tr[7]/td[2]').text
        except:
            date = ""
        info.append(date)
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[2]/div/div/table/tbody/tr/td/table/tbody/tr[2]/td[2]').text) 
        except:
            info.append("")
        info.append("")

        # Contact information
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[6]/div/div/table/tbody/tr/td/table/tbody/tr[2]/td[2]').text) 
        except:
            info.append("")
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[6]/div/div/table/tbody/tr/td/table/tbody/tr[3]/td[2]').text) 
        except:
            info.append("")
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[6]/div/div/table/tbody/tr/td/table/tbody/tr[4]/td[2]').text) 
        except:
            info.append("")
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[6]/div/div/table/tbody/tr/td/table/tbody/tr[5]/td[2]').text) 
        except:
            info.append(sub)
        info.append("")
        info.append("")
        info.append("")
        info.append("")
        info.append("")
        info.append(address)
        info.append(address)
        info.append(address)
        info.append(address)
        info.append(address) 

        # Number of employees
        info.append("")
        info.append("")
        info.append("")

        # Company status
        info.append(status)
        # Contact person name (local language, then English)
        try: 
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[3]/div/div/table/tbody/tr/td/table/tbody/tr[2]/td[1]').text)
        except:
            info.append("")
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[3]/div/div/table/tbody/tr/td/table/tbody/tr[2]/td[1]').text)
        except:
            info.append("")
        # Position and department
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[3]/div/div/table/tbody/tr/td/table/tbody/tr[2]/td[2]').text)
        except:
            info.append("Key Executive")
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[3]/div/div/table/tbody/tr/td/table/tbody/tr[2]/td[2]').text)
        except:
            info.append("Key Executive")
        info.append("Board of Directors")

        # Contact person's contact details
        try :
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[6]/div/div/table/tbody/tr/td/table/tbody/tr[2]/td[2]').text) 
        except:
            info.append("")
        try :
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[6]/div/div/table/tbody/tr/td/table/tbody/tr[3]/td[2]').text) 
        except:
            info.append("")
        try :
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[6]/div/div/table/tbody/tr/td/table/tbody/tr[4]/td[2]').text) 
        except:
            info.append("")
        try :
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[6]/div/div/table/tbody/tr/td/table/tbody/tr[2]/td[2]').text) 
        except:
            info.append("")

        # Financial Information
        driver.get(sub + finan)
        time.sleep(2)
        try:
            driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/ul/li[2]/a').click()
            time.sleep(1)
            # Fiscal year
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/thead/tr/td[5]').text) 
            except:
                try:
                    info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/thead/tr/td[3]').text) 
                except:
                    info.append("TTM")
            # Current assets
            info.append("")
            # Non-current assets
            info.append("")
            # Total assets
            info.append("")
            # Current liabilities
            info.append("")
            # Non-current liabilities
            info.append("")
            # Total liabilities
            info.append("")
            # Total equity
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[12]/td[5]').text + "Cr. INR")
            except:
                try:
                    info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[12]/td[3]').text + "Cr. INR")
                except:
                    info.append("")
            # Total liabilities and equity
            info.append("")

            # Revenue
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[2]/td[5]').text + "Cr. INR")
            except:
                try:
                    info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[2]/td[4]').text + "Cr. INR")
                except:
                    info.append(revenue)
            # Cost of goods sold
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[5]/td[5]').text + "Cr. INR")
            except:
                try:
                    info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[5]/td[3]').text + "Cr. INR")
                except:
                    info.append("") 
            # Selling, general and administrative expenses
            info.append("")
            # Operating profit (loss)
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[7]/td[5]').text + "Cr. INR")
            except:
                try:
                    info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[7]/td[3]').text + "Cr. INR")
                except:
                    info.append("")
            # Financial income
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[6]/td[5]').text + "Cr. INR")
            except:
                try:
                    info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[6]/td[3]').text + "Cr. INR")
                except:
                    info.append("")
            # Financial costs
            info.append("")
            # Other non-operating income
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[3]/td[5]').text + "Cr. INR")
            except:
                try:
                    info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[3]/td[3]').text + "Cr. INR")
                except:
                    info.append("") 
            # Other non-operating expenses
            info.append("")

            # Profit before income tax
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[9]/td[5]').text + "Cr. INR")
            except:
                try:
                    info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[9]/td[3]').text + "Cr. INR")
                except:
                    info.append("")
            # Income tax expense
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[10]/td[5]').text + "Cr. INR")
            except:
                try:
                    info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[10]/td[3]').text + "Cr. INR")
                except:
                    info.append("")
            # Net income
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[11]/td[5]').text + "Cr. INR")
            except:
                try:
                    info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[11]/td[3]').text + "Cr. INR")
                except:
                    info.append(profit)

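            # Cash flows
            # Operating activities
            info.append("")
            # Investing activities
            info.append("")
            # Financing activities
            info.append("")
            # Opening and closing cash balances
            info.append("")
            info.append("")
            # Ratios to be calculated later from the financial data
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")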

            # Industry and Products      
            info.append(industry1)
            info.append(industry1)
            info.append(industry2)
            info.append(industry2)


            # Language Information
            info.append("ENG")
            info.append("English")
            info.append("English")
        except:
            # Financial Information - Default
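            # Fiscal year
            info.append("TTM")
            # Current assets
            info.append("")
            # Non-current assets
            info.append("")
            # Total assets
            info.append("")
            # Current liabilities
            info.append("")
            # Non-current liabilities
            info.append("")
            # Total liabilities
            info.append("")
            # Total equity
            info.append("")
            # Total liabilities and equity
            info.append("")

            # Revenue
            info.append(revenue)
            # Cost of goods sold
            info.append("")
            # Selling, general and administrative expenses
            info.append("")
            # Operating profit (loss)
            info.append("")
            # Financial income
            info.append("")
            # Financial costs
            info.append("")
            # Other non-operating income
            info.append("")
            # Other non-operating expenses
            info.append("")
            # Profit before income tax
            info.append("")
            # Income tax expense
            info.append("")
            # Net income
            info.append(profit)

            # Cash flows
            # Operating activities
            info.append("")
            # Investing activities
            info.append("")
            # Financing activities
            info.append("")
            # Opening and closing cash balances
            info.append("")
            info.append("")
            # Ratios to be calculated later from the financial data
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")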

            # Industry and Products       
            info.append(industry1)
            info.append(industry1)
            info.append(industry2)
            info.append(industry2)


            # Language Information
            info.append("ENG")
            info.append("English")
            info.append("English")

        # Stock Market Information
        driver.get(sub)
        time.sleep(2)
        info.append("BSE")
        info.append("Bombay Stock Exchange")
        info.append("Bombay Stock Exchange")
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div[4]/div/div[2]/div/div/table/tbody[1]/tr[2]/th[2]/strong').text)
        except:
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[2]/div/div[1]/div[1]/div[1]/div[2]/div/div[2]').text)
            except:
                info.append("")
        # Listing date
        info.append(date)

        # Stock price
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[4]/div[2]/div[1]/div/table/tbody/tr[1]/td[2]').text + "INR")
        except:
            info.append("")
        info.append("")
        info.append("")
        info.append("")
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[4]/div[2]/div[2]/div/table/tbody/tr[1]/td[2]').text + "INR")
        except:
            info.append("")
        info.append("")
        info.append("")
        info.append("")

        # Trading volume
        info.append("")
        # Market capitalization
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[4]/div[2]/div[3]/div/table/tbody/tr[5]/td[2]').text + "Cr. INR")
        except:
            info.append("")

        # Branch
        info.append("")
        info.append("")
        info.append("India")
        info.append("India, Asia")
        info.append("")
        info.append("")

        # Event
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div[3]/div/div[1]/div/div/div/div[1]/table/tbody[1]/tr[1]/td[1]/a').text)
        except:
            info.append("")

        # Currency Information
        info.append("INR")
        info.append("Indian Rupee")

        # Management
        info.append("Chris")
        info.append("bseindia.com")
        info.append("2022-07-14")
        
        # Status Check
        print(row)
        print(info)  
        
        # Adding Collected Data
        for item in info :
            worksheet.write(row, column, item)
            column += 1
        row += 1
    except:
        try:
            # crawling environment setting
            print(row)
            info = []
            driver.get(sub)
            time.sleep(2)
            # Company Basic Information
            name = driver.find_element(By.XPATH, value='/html/body/div[1]/div[6]/div/div[3]/div/div[1]/div[1]/div[1]/div[2]/div/h1').text
            status = "Public"
            revenue = ""
            profit = ""
            
            # Country Information - Must be investigated individually
            info.append("IND") 
            info.append("India") 
            info.append("India") 
            info.append("UTC+05:30") 
            info.append("아시아") 
            info.append("11745000000000 USD") 
            info.append("1352642280") 
            info.append("남아시아")
            
            # Company Code/Name
            info.append("HBRIND" + name) 
            info.append(name)
            info.append(name)
            industry1 = "public company"
            industry2 = "products and services"

            # Address - Default
            address = "India, South Asia, Asia"

            info.append(name + " (English: " + name + ")" + " is a public company that is listed on Bombay Stock Exchange.")
            info.append(name + " (English: " + name + ")" + " is a public company that is listed on Bombay Stock Exchange.")
            info.append(name + " (English: " + name + ")" + " is a "+ industry2 + " company that is listed on Bombay Stock Exchange. They are providing " + industry2  + " in " + address + ", and they are operated as a public company.")
            info.append(name + " (English: " + name + ")" + " is a "+ industry2 + " company that is listed on Bombay Stock Exchange. They are providing " + industry2  + " in " + address + ", and they are operated as a public company.")
            # Founding date
            date = ""
            info.append("")
            info.append("")
            info.append("")

            # Contact information
            info.append("")
            info.append("")
            info.append("")
            info.append(sub)
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append(address)
            info.append(address)
            info.append(address)
            info.append(address)
            info.append(address) 

            # Number of employees
            info.append("")
            info.append("")
            info.append("")

            # Company status
            info.append(status)
            # Contact person name (local language, then English)
            info.append("")
            info.append("")
            # Position and department
            info.append("Key Executive")
            info.append("Key Executive")
            info.append("Board of Directors")

            # Contact person's contact details
            info.append("")
            info.append("")
            info.append("")
            info.append("")

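            # Financial Information - Default

            # Fiscal year
            info.append("TTM")
            # Current assets
            info.append("")
            # Non-current assets
            info.append("")
            # Total assets
            info.append("")
            # Current liabilities
            info.append("")
            # Non-current liabilities
            info.append("")
            # Total liabilities
            info.append("")
            # Total equity
            info.append("")
            # Total liabilities and equity
            info.append("")

            # Revenue
            info.append(revenue)
            # Cost of goods sold
            info.append("")
            # Selling, general and administrative expenses
            info.append("")
            # Operating profit (loss)
            info.append("")
            # Financial income
            info.append("")
            # Financial costs
            info.append("")
            # Other non-operating income
            info.append("")
            # Other non-operating expenses
            info.append("")

            # Profit before income tax
            info.append("")
            # Income tax expense
            info.append("")
            # Net income
            info.append(profit)

            # Cash flows
            # Operating activities
            info.append("")
            # Investing activities
            info.append("")
            # Financing activities
            info.append("")
            # Opening and closing cash balances
            info.append("")
            info.append("")
            # Ratios to be calculated later from the financial data
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")

            # Industry and Products - Default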
            info.append(industry1)
            info.append(industry1)
            info.append(industry2)
            info.append(industry2)

            # Language Information
            info.append("ENG")
            info.append("English")
            info.append("English")

            # Stock Market Information - Default
            info.append("BSE")
            info.append("Bombay Stock Exchange")
            info.append("Bombay Stock Exchange")
            info.append("")
            
            # Listing date
            info.append(date)

            # Stock price
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")

            # Trading volume
            info.append("")
            # Market capitalization
            info.append("")

            # Branch
            info.append("")
            info.append("")
            info.append("India")
            info.append("India, Asia")
            info.append("")
            info.append("")

            # Event - Default
            info.append("")

            # Currency Information
            info.append("INR")
            info.append("Indian Rupee")

            # Management
            info.append("Chris")
            info.append("bseindia.com")
            info.append("2022-07-14")
            
            #Adding Collected Data
            print(info)
            for item in info :
                worksheet.write(row, column, item)
                column += 1
            row += 1
            
        # If not possible, add link to the failList
        except: 
            failList.append(sub)
            
# Check failed result 
# - There are many ways to deal with failList: retry, ignore, add manually, or investigate why the error occurred.
# - The choice is up to the user.
print("failLength:")
print(len(failList))
print(failList)

# The workbook must be closed at the end; xlsxwriter only writes the file to disk on close().
workbook.close()
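
One of the options listed for `failList` is to retry the failed links. A minimal sketch of such a retry pass is given here, assuming the per-company scraping logic has been wrapped in a function that raises on failure. `retry_failed` and `scrape_one` are hypothetical names that do not appear in the original cells, and `flaky_scrape` only simulates pages that fail to load.

```python
def retry_failed(fail_list, scrape_one, max_attempts=2):
    """Re-run scrape_one(link) for each failed link; return links that still fail."""
    remaining = list(fail_list)
    for _ in range(max_attempts):
        still_failing = []
        for link in remaining:
            try:
                scrape_one(link)
            except Exception:
                # Keep the link for the next round instead of dropping it
                still_failing.append(link)
        remaining = still_failing
        if not remaining:
            break
    return remaining


# Simulated scraper (hypothetical): "bad" always fails, other links succeed on the 2nd call
attempts = {}


def flaky_scrape(link):
    attempts[link] = attempts.get(link, 0) + 1
    if link == "bad" or attempts[link] < 2:
        raise RuntimeError("page did not load")


left = retry_failed(["a", "bad"], flaky_scrape, max_attempts=3)
print(left)
```

The retry pass has to run before `workbook.close()`, because rows can no longer be written once the workbook is closed. Links that remain in the returned list can then be added manually or investigated.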

### 2017

In [None]:
# Basic File Setting
workbook = xlsxwriter.Workbook('India2017.xlsx')
worksheet = workbook.add_worksheet()
base_url = "https://www.bseindia.com"
corp = "corp-information/"
finan = "financials-results/"
row = 0
column = 0
company_links = []

# Data Fields(Column Names) that will be collected
content = ['헤브론스타국가코드','현지언어국가명','영문국가명','시간','대륙','GDP','인구','지역','기업식별코드','현지언어기업명','영문기업명','현지언어한줄소개내용','영문한줄소개내용','현지언어기업소개내용','영문기업소개내용','설립일자','법인등록번호','사업자등록번호','기업대표전화번호','대표팩스번호','대표이메일','기업홈페이지URL','페이스북URL','인스타그램URL','유튜브URL','링크드인URL','트위터핸들','현지언어기업주소','영문기업주소','현지언어기업상세주소','영문기업상세주소','기업우편번호','기업종업원','외감법인구분','기업연수','기업상태','현지언어담당자명','영문담당자명','현지언어직위명','영문직위명','담당자부서명','담당자전화번호','담당자팩스번호','담당자이메일','담당자이동전화번호','회계연도','유동자산금액','비유동자산금액','자산총계금액','유동부채금액','비유동부채금액','부채총계금액','자본총계금액','부채자본총계금액','매출액','매출원가금액','판매비관리비금액','영업이익손실금액','금융수익금액','금융비용금액','기타영업외수익금액','기타영업외비용금액','법인세차감전순이익','법인세비용','당기순이익','영업활동현금흐름금액','투자활동현금흐름금액','재무활동현금흐름금액','기초현금자산금액','기말현금자산금액','부채비율','영업이익율','매출액증가율','영업이익증가율','당기순이익 증가율','기업 CAGR','현지언어산업군명','영문산업군명','현지언어주요제품명내용','영문주요제품명내용','국가언어코드','현지언어언어명','영문언어명','주식시장코드','현지언어주식시장명','영문주식시장명','상장코드','상장일자','주가(일)','주가(1주)','주가(1개월)','주가(6개월)','주가(1년)','주가(3년)','주가(5년)','주가(10년)','거래량','시가총액','지점코드','지점명','주소','주소상세','우편번호','사업자등록번호','이벤트','통화구분코드','화폐단위명','담당자','소스','날짜']

# Put Column names on the sheet
for item in content :
    worksheet.write(row, column, item)
    column += 1
row += 1

# Open the BSE India list of scrips page
driver = webdriver.Chrome(path)
driver.get('https://www.bseindia.com/corporates/List_Scrips.html')
time.sleep(4)
driver.find_element(By.XPATH, value='/html/body/div[4]/div[1]/div[2]/select').click()
time.sleep(1)
driver.find_element(By.XPATH, value='/html/body/div[4]/div[1]/div[2]/select/option[2]').click()
time.sleep(1)
driver.find_element(By.XPATH, value='/html/body/div[4]/div[1]/div[4]/select').click()
time.sleep(1)
driver.find_element(By.XPATH, value='/html/body/div[4]/div[1]/div[4]/select/option[2]').click()
time.sleep(1)
driver.find_element(By.XPATH, value='/html/body/div[4]/div[2]/div[5]/input').click()
time.sleep(10)
text = driver.page_source
time.sleep(1)

# Get each company's link from the <a> tags (first selection)
soup = bs4.BeautifulSoup(text,'html.parser')
maintable = soup.find('table',{"class":"mGrid ng-scope"})
time.sleep(1)
all_atag_maintable = maintable.find_all('a')
all_atag_maintableHead = all_atag_maintable

driver.find_element(By.XPATH, value='/html/body/div[4]/div[1]/div[4]/select').click()
time.sleep(1)
driver.find_element(By.XPATH, value='/html/body/div[4]/div[1]/div[4]/select/option[3]').click()
time.sleep(1)
driver.find_element(By.XPATH, value='/html/body/div[4]/div[2]/div[5]/input').click()
time.sleep(7)
text2 = driver.page_source
time.sleep(1)

# Get each company's link from the <a> tags (second selection)
soup2 = bs4.BeautifulSoup(text2,'html.parser')
maintable2 = soup2.find('table',{"class":"mGrid ng-scope"})
time.sleep(1)
all_atag_maintable2 = maintable2.find_all('a')
all_atag_maintableHead2 = all_atag_maintable2

driver.find_element(By.XPATH, value='/html/body/div[4]/div[1]/div[4]/select').click()
time.sleep(1)
driver.find_element(By.XPATH, value='/html/body/div[4]/div[1]/div[4]/select/option[4]').click()
time.sleep(1)
driver.find_element(By.XPATH, value='/html/body/div[4]/div[2]/div[5]/input').click()
time.sleep(7)
text3 = driver.page_source
time.sleep(1)

# Get each company's link from the <a> tags (third selection)
soup3 = bs4.BeautifulSoup(text3,'html.parser')
maintable3 = soup3.find('table',{"class":"mGrid ng-scope"})
time.sleep(1)
all_atag_maintable3 = maintable3.find_all('a')
all_atag_maintableHead3 = all_atag_maintable3

# Final Link Setting
for a in all_atag_maintableHead:
    company_link = a.attrs["href"]
    company_links.append(company_link)

# Final Link Setting
for a in all_atag_maintableHead2:
    company_link = a.attrs["href"]
    company_links.append(company_link)

# Final Link Setting
for a in all_atag_maintableHead3:
    company_link = a.attrs["href"]
    company_links.append(company_link)

# To check if all companies are selected
# print(len(company_links))
# Main Data Collection code
for sub in company_links:
    # Start from A column every time
    column = 0
    info = []
    try:
        # crawling environment setting
        driver.get(sub)
        time.sleep(2)
        # Company Basic Information Setting
        name = driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[2]/div/div[1]/div[1]/div[1]/div[2]/div/h1').text
        try:
            status = driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[4]/div[2]/div[5]/div/table/tbody/tr[1]/td[2]').text
        except:
            status = "Public"
        try:
            revenue = driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div[3]/div/div[2]/div/div/div/div[1]/div/div[1]/table/tbody[1]/tr[1]/td[4]').text + "Cr. INR"
        except:
            revenue = ""
        try:
            profit = driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div[3]/div/div[2]/div/div/div/div[1]/div/div[1]/table/tbody[2]/tr[1]/td[4]').text + "Cr. INR"
        except:
            profit = ""
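        # Country Information - Must be investigated individually
        info.append("IND")
        info.append("India")
        info.append("India")
        info.append("UTC+05:30")
        info.append("아시아")  # continent: "Asia" (Korean label kept as stored in the output sheet)
        info.append("11745000000000 USD")
        info.append("1352642280")
        info.append("남아시아")  # region: "South Asia" (Korean label kept as stored in the output sheet)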
        driver.get(sub + corp)
        time.sleep(2)
        # Company Basic Information
        try:
            info.append("HBR" + driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[2]/div/div/table/tbody/tr/td/table/tbody/tr[1]/td[2]').text)   
        except:
            info.append("HBRIND" + name) 
        info.append(name)
        info.append(name)
        try: 
            industry1 = driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[2]/div/div/table/tbody/tr/td/table/tbody/tr[3]/td[2]').text
            industry2 = driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[2]/div/div/table/tbody/tr/td/table/tbody/tr[3]/td[2]').text
        except:
            industry1 = "public company"
            industry2 = "products and services"

        # Description, Contact, Address, Extra Information, Management Information
        try: 
            address = driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[6]/div/div/table/tbody/tr/td/table/tbody/tr[1]/td').text
        except: 
            address = "India, South Asia, Asia"

        info.append(name + " (English: " + name + ")" + " is a public company that is listed on Bombay Stock Exchange.")
        info.append(name + " (English: " + name + ")" + " is a public company that is listed on Bombay Stock Exchange.")
        info.append(name + " (English: " + name + ")" + " is a "+ industry2 + " company that is listed on Bombay Stock Exchange. They are providing " + industry2  + " in " + address + ", and they are operated as a public company.")
        info.append(name + " (English: " + name + ")" + " is a "+ industry2 + " company that is listed on Bombay Stock Exchange. They are providing " + industry2  + " in " + address + ", and they are operated as a public company.")
        # Date of establishment
        try:
            date = driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[2]/div/div/table/tbody/tr/td/table/tbody/tr[7]/td[2]').text
        except:
            date = ""
        # date is always a string at this point, so no extra try/except is needed
        info.append(date)
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[2]/div/div/table/tbody/tr/td/table/tbody/tr[2]/td[2]').text) 
        except:
            info.append("")
        info.append("")

        # Contact information
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[6]/div/div/table/tbody/tr/td/table/tbody/tr[2]/td[2]').text) 
        except:
            info.append("")
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[6]/div/div/table/tbody/tr/td/table/tbody/tr[3]/td[2]').text) 
        except:
            info.append("")
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[6]/div/div/table/tbody/tr/td/table/tbody/tr[4]/td[2]').text) 
        except:
            info.append("")
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[6]/div/div/table/tbody/tr/td/table/tbody/tr[5]/td[2]').text) 
        except:
            info.append(sub)
        info.append("")
        info.append("")
        info.append("")
        info.append("")
        info.append("")
        info.append(address)
        info.append(address)
        info.append(address)
        info.append(address)
        info.append(address) 

        # Number of employees
        info.append("")
        info.append("")
        info.append("")

        # Company status
        info.append(status)
        # Contact person name (local language)
        try: 
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[3]/div/div/table/tbody/tr/td/table/tbody/tr[2]/td[1]').text)
        except:
            info.append("")
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[3]/div/div/table/tbody/tr/td/table/tbody/tr[2]/td[1]').text)
        except:
            info.append("")
        # Position and department
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[3]/div/div/table/tbody/tr/td/table/tbody/tr[2]/td[2]').text)
        except:
            info.append("Key Executive")
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[3]/div/div/table/tbody/tr/td/table/tbody/tr[2]/td[2]').text)
        except:
            info.append("Key Executive")
        info.append("Board of Directors")

        # Contact person's contact details
        try :
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[6]/div/div/table/tbody/tr/td/table/tbody/tr[2]/td[2]').text) 
        except:
            info.append("")
        try :
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[6]/div/div/table/tbody/tr/td/table/tbody/tr[3]/td[2]').text) 
        except:
            info.append("")
        try :
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[6]/div/div/table/tbody/tr/td/table/tbody/tr[4]/td[2]').text) 
        except:
            info.append("")
        try :
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div/div[6]/div/div/table/tbody/tr/td/table/tbody/tr[2]/td[2]').text) 
        except:
            info.append("")

        # Financial Information
        driver.get(sub + finan)
        time.sleep(2)
        try:
            driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/ul/li[2]/a').click()
            time.sleep(1)
            # Fiscal year
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/thead/tr/td[6]').text) 
            except:
                try:
                    info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/thead/tr/td[3]').text) 
                except:
                    info.append("TTM")
            # Current assets
            info.append("")
            # Non-current assets
            info.append("")
            # Total assets
            info.append("")
            # Current liabilities
            info.append("")
            # Non-current liabilities
            info.append("")
            # Total liabilities
            info.append("")
            # Total equity
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[12]/td[6]').text + "Cr. INR")
            except:
                try:
                    info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[12]/td[3]').text + "Cr. INR")
                except:
                    info.append("")
            # Total liabilities and equity
            info.append("")

            # Revenue
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[2]/td[6]').text + "Cr. INR")
            except:
                try:
                    info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[2]/td[4]').text + "Cr. INR")
                except:
                    info.append(revenue)
            # Cost of sales
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[5]/td[6]').text + "Cr. INR")
            except:
                try:
                    info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[5]/td[3]').text + "Cr. INR")
                except:
                    info.append("") 
            # Selling, general and administrative expenses
            info.append("")
            # Operating profit (loss)
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[7]/td[6]').text + "Cr. INR")
            except:
                try:
                    info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[7]/td[3]').text + "Cr. INR")
                except:
                    info.append("")
            # Finance income
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[6]/td[6]').text + "Cr. INR")
            except:
                try:
                    info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[6]/td[3]').text + "Cr. INR")
                except:
                    info.append("")
            # Finance costs
            info.append("")
            # Other non-operating income
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[3]/td[6]').text + "Cr. INR")
            except:
                try:
                    info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[3]/td[3]').text + "Cr. INR")
                except:
                    info.append("") 
            # Other non-operating expenses
            info.append("")

            # Profit before income tax
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[9]/td[6]').text + "Cr. INR")
            except:
                try:
                    info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[9]/td[3]').text + "Cr. INR")
                except:
                    info.append("")
            # Income tax expense
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[10]/td[6]').text + "Cr. INR")
            except:
                try:
                    info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[10]/td[3]').text + "Cr. INR")
                except:
                    info.append("")
            # Net profit
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[11]/td[6]').text + "Cr. INR")
            except:
                try:
                    info.append(driver.find_element(By.XPATH, value='/html/body/div/div[4]/div[5]/div[6]/div/div/ng-view/div[2]/div/div/div/div[2]/div/div[2]/table/tbody/tr/td/table[1]/tbody/tr[11]/td[3]').text + "Cr. INR")
                except:
                    info.append(profit)

            # Cash flows
            # Operating
            info.append("")
            # Investing
            info.append("")
            # Financing
            info.append("")
            # Beginning and ending cash balances
            info.append("")
            info.append("")
            # Financial figures to be calculated later
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")

            # Industry and Products      
            info.append(industry1)
            info.append(industry1)
            info.append(industry2)
            info.append(industry2)


            # Language Information
            info.append("ENG")
            info.append("English")
            info.append("English")
        except:
            # Financial Information - Default
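            # Fiscal year
            info.append("TTM")
            # Current assets
            info.append("")
            # Non-current assets
            info.append("")
            # Total assets
            info.append("")
            # Current liabilities
            info.append("")
            # Non-current liabilities
            info.append("")
            # Total liabilities
            info.append("")
            # Total equity
            info.append("")
            # Total liabilities and equity
            info.append("")

            # Revenue
            info.append(revenue)
            # Cost of sales
            info.append("")
            # Selling, general and administrative expenses
            info.append("")
            # Operating profit (loss)
            info.append("")
            # Finance income
            info.append("")
            # Finance costs
            info.append("")
            # Other non-operating income
            info.append("")
            # Other non-operating expenses
            info.append("")
            # Profit before income tax
            info.append("")
            # Income tax expense
            info.append("")
            # Net profit
            info.append(profit)

            # Cash flows
            # Operating
            info.append("")
            # Investing
            info.append("")
            # Financing
            info.append("")
            # Beginning and ending cash balances
            info.append("")
            info.append("")
            # Financial figures to be calculated later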
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")

            # Industry and Products       
            info.append(industry1)
            info.append(industry1)
            info.append(industry2)
            info.append(industry2)


            # Language Information
            info.append("ENG")
            info.append("English")
            info.append("English")

        # Stock Market Information
        driver.get(sub)
        time.sleep(2)
        info.append("BSE")
        info.append("Bombay Stock Exchange")
        info.append("Bombay Stock Exchange")
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div[4]/div/div[2]/div/div/table/tbody[1]/tr[2]/th[2]/strong').text)
        except:
            try:
                info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[2]/div/div[1]/div[1]/div[1]/div[2]/div/div[2]').text)
            except:
                info.append("")
        # Listing date
        info.append(date)

        # Stock price
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[4]/div[2]/div[1]/div/table/tbody/tr[1]/td[2]').text + "INR")
        except:
            info.append("")
        info.append("")
        info.append("")
        info.append("")
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[4]/div[2]/div[2]/div/table/tbody/tr[1]/td[2]').text + "INR")
        except:
            info.append("")
        info.append("")
        info.append("")
        info.append("")

        # Trading volume
        info.append("")
        # Market capitalization
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[4]/div[2]/div[3]/div/table/tbody/tr[5]/td[2]').text + "Cr. INR")
        except:
            info.append("")

        # Branch
        info.append("")
        info.append("")
        info.append("India")
        info.append("India, Asia")
        info.append("")
        info.append("")

        # Event
        try:
            info.append(driver.find_element(By.XPATH, value='/html/body/div[1]/div[4]/div[5]/div[6]/div/div/ng-view/div[3]/div/div[1]/div/div/div/div[1]/table/tbody[1]/tr[1]/td[1]/a').text)
        except:
            info.append("")

        # Currency Information
        info.append("INR")
        info.append("Indian Rupee")

        # Management
        info.append("Chris")
        info.append("bseindia.com")
        info.append("2022-07-14")
        
        # Status Check
        print(row)
        print(info)  
        
        # Adding Collected Data
        for item in info :
            worksheet.write(row, column, item)
            column += 1
        row += 1
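    except Exception:
        try:
            # Fallback: reload the page and fill the row with default values
            print(row)
            info = []
            # Restart from column A in case the first attempt failed while writing
            column = 0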
            driver.get(sub)
            time.sleep(2)
            # Company Basic Information
            name = driver.find_element(By.XPATH, value='/html/body/div[1]/div[6]/div/div[3]/div/div[1]/div[1]/div[1]/div[2]/div/h1').text
            status = "Public"
            revenue = ""
            profit = ""
            
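            # Country Information - Must be investigated individually
            info.append("IND")
            info.append("India")
            info.append("India")
            info.append("UTC+05:30")
            info.append("아시아")  # continent: "Asia" (Korean label kept as stored in the output sheet)
            info.append("11745000000000 USD")
            info.append("1352642280")
            info.append("남아시아")  # region: "South Asia" (Korean label kept as stored in the output sheet)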
            
            # Company Code/Name
            info.append("HBRIND" + name) 
            info.append(name)
            info.append(name)
            # Industry and Products - Default
            industry1 = "public company"
            industry2 = "products and services"

            # Address - Default
            address = "India, South Asia, Asia"

            info.append(name + " (English: " + name + ")" + " is a public company that is listed on Bombay Stock Exchange.")
            info.append(name + " (English: " + name + ")" + " is a public company that is listed on Bombay Stock Exchange.")
            info.append(name + " (English: " + name + ")" + " is a "+ industry2 + " company that is listed on Bombay Stock Exchange. They are providing " + industry2  + " in " + address + ", and they are operated as a public company.")
            info.append(name + " (English: " + name + ")" + " is a "+ industry2 + " company that is listed on Bombay Stock Exchange. They are providing " + industry2  + " in " + address + ", and they are operated as a public company.")
            # Date of establishment
            date = ""
            info.append("")
            info.append("")
            info.append("")

            # Contact information
            info.append("")
            info.append("")
            info.append("")
            info.append(sub)
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append(address)
            info.append(address)
            info.append(address)
            info.append(address)
            info.append(address) 

            # Number of employees
            info.append("")
            info.append("")
            info.append("")

            # Company status
            info.append(status)
            # Contact person name (local language)
            info.append("")
            info.append("")
            # Position and department
            info.append("Key Executive")
            info.append("Key Executive")
            info.append("Board of Directors")

            # Contact person's contact details
            info.append("")
            info.append("")
            info.append("")
            info.append("")

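            # Financial Information - Default

            # Fiscal year
            info.append("TTM")
            # Current assets
            info.append("")
            # Non-current assets
            info.append("")
            # Total assets
            info.append("")
            # Current liabilities
            info.append("")
            # Non-current liabilities
            info.append("")
            # Total liabilities
            info.append("")
            # Total equity
            info.append("")
            # Total liabilities and equity
            info.append("")

            # Revenue
            info.append(revenue)
            # Cost of sales
            info.append("")
            # Selling, general and administrative expenses
            info.append("")
            # Operating profit (loss)
            info.append("")
            # Finance income
            info.append("")
            # Finance costs
            info.append("")
            # Other non-operating income
            info.append("")
            # Other non-operating expenses
            info.append("")

            # Profit before income tax
            info.append("")
            # Income tax expense
            info.append("")
            # Net profit
            info.append(profit)

            # Cash flows
            # Operating
            info.append("")
            # Investing
            info.append("")
            # Financing
            info.append("")
            # Beginning and ending cash balances
            info.append("")
            info.append("")
            # Financial figures to be calculated later
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")

            # Industry and Products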
            info.append(industry1)
            info.append(industry1)
            info.append(industry2)
            info.append(industry2)

            # Language Information
            info.append("ENG")
            info.append("English")
            info.append("English")

            # Stock Market Information - Default
            info.append("BSE")
            info.append("Bombay Stock Exchange")
            info.append("Bombay Stock Exchange")
            info.append("")
            
            # Listing date
            info.append(date)

            # Stock price
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")
            info.append("")

            # Trading volume
            info.append("")
            # Market capitalization
            info.append("")

            # Branch
            info.append("")
            info.append("")
            info.append("India")
            info.append("India, Asia")
            info.append("")
            info.append("")

            # Event - Default
            info.append("")

            # Currency Information
            info.append("INR")
            info.append("Indian Rupee")

            # Management
            info.append("Chris")
            info.append("bseindia.com")
            info.append("2022-07-14")
            
            #Adding Collected Data
            print(info)
            for item in info :
                worksheet.write(row, column, item)
                column += 1
            row += 1
            
        # If the fallback also fails, add the link to failList
        except Exception:
            failList.append(sub)
            
# Check failed results
# - There are several ways to handle failList: retry, ignore, add manually, or investigate why the error occurred.
# - The choice is up to the user.
print("failLength:")
print(len(failList))
print(failList)

# Closing the workbook is required - the collected data is only written to the file when the workbook is closed.
workbook.close()
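
The repeated `try`/`except` lookups in the loop above could be collapsed into a small helper. The sketch below is only an illustration, not part of the original crawler: `safe_text`, `first_text`, and the fake driver classes are hypothetical names, and the short XPaths are placeholders. It assumes the driver exposes selenium's `find_element(by, value)` method, and uses the string `"xpath"`, which is the value of `By.XPATH`.

```python
def safe_text(driver, xpath, default="", suffix=""):
    """Return the text of the element at `xpath`, or `default` if the lookup fails.

    `suffix` (for example "Cr. INR") is appended only when the lookup succeeds.
    """
    try:
        # "xpath" is the string value of selenium's By.XPATH constant
        return driver.find_element("xpath", xpath).text + suffix
    except Exception:
        return default


def first_text(driver, xpaths, default="", suffix=""):
    """Try several XPaths in order and return the text of the first one found."""
    for xpath in xpaths:
        try:
            return driver.find_element("xpath", xpath).text + suffix
        except Exception:
            continue
    return default


class _FakeElement:
    """Hypothetical stand-in for a selenium WebElement."""
    def __init__(self, text):
        self.text = text


class _FakeDriver:
    """Hypothetical stand-in for a selenium driver, so the sketch runs without a browser."""
    def __init__(self, pages):
        self.pages = pages

    def find_element(self, by, value):
        if value not in self.pages:
            raise LookupError(value)
        return _FakeElement(self.pages[value])


# Placeholder XPaths; the real crawler would pass its full absolute XPaths
fake = _FakeDriver({"//h1": "ACME LTD", "//td[3]": "12.5"})
name = safe_text(fake, "//h1")
revenue = first_text(fake, ["//td[6]", "//td[3]"], suffix="Cr. INR")
status = safe_text(fake, "//missing", default="Public")
print(name, revenue, status)
```

With a helper like this, each nested `try`/`except` pair in the financial section would reduce to one `info.append(first_text(driver, [...], suffix="Cr. INR"))` call, which keeps the column order easier to audit.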

## 5. FailList
#### - There are several ways to handle the links in failList: retry (if the failure was caused by server loading status), ignore (if the data genuinely does not exist), add manually (if only a few remain), or investigate the cause (if many links failed).
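
The retry option could be sketched as follows. This is a minimal illustration under stated assumptions, not the original author's method: `retry_failed` and `crawl_one` are hypothetical names, where `crawl_one(link)` stands for whatever routine crawls a single company page and raises an exception on failure. The demo uses a fake crawl function and example URLs so that it runs without a browser.

```python
import time


def retry_failed(links, crawl_one, attempts=3, delay=0.0):
    """Re-run `crawl_one(link)` for each link and return the links that still fail.

    `crawl_one` is a hypothetical callable that raises an exception on failure.
    `delay` is the pause in seconds after each failed attempt.
    """
    still_failing = []
    for link in links:
        for _ in range(attempts):
            try:
                crawl_one(link)
                break  # success: stop retrying this link
            except Exception:
                time.sleep(delay)
        else:
            # The inner loop ended without a break, so every attempt failed
            still_failing.append(link)
    return still_failing


# Demo with a fake crawl function: one link succeeds on the second try,
# the other never succeeds
calls = {}


def demo_crawl(link):
    calls[link] = calls.get(link, 0) + 1
    if link.endswith("/missing"):
        raise ValueError("no such page")
    if calls[link] < 2:
        raise TimeoutError("page still loading")


remaining = retry_failed(
    ["https://example.com/slow", "https://example.com/missing"],
    demo_crawl,
    attempts=3,
)
print(remaining)
```

Links that remain in the returned list after a few attempts are more likely to be missing data than server load, so they are candidates for the ignore or manual-add options.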