# Belgian AI Landscape


### 1. Import Modules

In [7]:
import numpy as np
import time
from scrapeData import getData, createDriver, getAddress
from concurrent.futures import ThreadPoolExecutor

### 2. Scrape company information from AI4Belgium

In [None]:
# Scrape the AI4Belgium AI landscape page into a DataFrame
ai_df = getData('https://www.ai4belgium.be/ai-landscape/')

### 3. Search for the company name in Google search and Maps to obtain addresses

    Note: We use a thread pool to run the address lookups concurrently, which speeds up the scraping.

In [10]:
# Start the timer (process_time() reports CPU time, not wall-clock time)
start_time = time.process_time()

# Build as many drivers as there are threads, so each thread gets its own driver
with ThreadPoolExecutor(max_workers=10) as executor:
    result = [executor.submit(getAddress, name) for name in ai_df['Company Name']]

print(round(time.process_time() - start_time, 2), "seconds")

# Collect the first element of each finished task's result into a list of addresses
addresses = [item.result()[0] for item in result]

# Add the addresses as a new column in the DataFrame
ai_df['Address'] = addresses

# Save the DataFrame as a CSV file
ai_df.to_csv('data/AILandscape_from_script.csv', index=False)

38.36 seconds
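
A note on reading the timing: `time.process_time()` counts CPU time only, so time spent waiting on web pages is not included, while `time.perf_counter()` measures elapsed wall-clock time. The sketch below contrasts the two on a thread pool. It is only an illustration: `fake_get_address` is a hypothetical stand-in for `getAddress` (assumed here to return a tuple whose first element is the address), and the company names are made up.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_get_address(name):
    """Hypothetical stand-in for getAddress: waits briefly, then returns a 1-tuple."""
    time.sleep(0.05)  # simulates waiting on a web page (I/O-bound work)
    return (f"Address of {name}",)


names = ["Company A", "Company B", "Company C", "Company D"]

wall_start = time.perf_counter()  # wall-clock timer
cpu_start = time.process_time()   # CPU-time timer (ignores sleeping and waiting)

with ThreadPoolExecutor(max_workers=4) as executor:
    # executor.map yields results in the same order as the input names
    sketch_addresses = [res[0] for res in executor.map(fake_get_address, names)]

wall_elapsed = time.perf_counter() - wall_start
cpu_elapsed = time.process_time() - cpu_start

print(sketch_addresses)
print(round(wall_elapsed, 2), "seconds wall-clock")
print(round(cpu_elapsed, 2), "seconds CPU")
```

For I/O-bound scraping, the wall-clock figure is usually the one that reflects how long the user actually waits.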


### 4. Check for Missing Values

    Currently, our script scrapes addresses for 312 of the 437 companies (i.e. 125 are missing).

In [18]:
(ai_df['Address'] == '').sum()

125
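
The check above counts empty strings only. If the CSV is later reloaded with `pd.read_csv`, empty fields are read back as `NaN` by default, so a comparison with `''` would no longer find them. A minimal sketch of a more tolerant check follows, using a small made-up `sample_df` (hypothetical data, not the scraped results) in place of `ai_df`:

```python
import pandas as pd

# Hypothetical sample data; the real ai_df comes from the scraping steps above
sample_df = pd.DataFrame({
    "Company Name": ["A", "B", "C", "D"],
    "Address": ["Rue de la Loi 16, Brussels", "", "   ", None],
})

# Treat NaN/None, empty strings, and whitespace-only strings as missing
address = sample_df["Address"]
is_missing = address.isna() | (address.fillna("").str.strip() == "")

print(is_missing.sum())
print(sample_df.loc[is_missing, "Company Name"].tolist())
```

Listing the affected company names, rather than only counting them, gives a ready-made input for a second scraping pass.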

### 5. Next Steps

1. Automate filling in the missing values. Idea: scrape other websites for the missing addresses.
2. Visualize the gathered data on a map.
3. Create a Streamlit app and deploy it.