## NeuroFinder Processing Tool - Jupyter Notebook Version

Welcome to the NeuroFinder Processing Tool. This Jupyter Notebook provides a non-GUI interface to run the project and get familiar with its functionalities. You can process your data files, update databases, and generate reports directly within this notebook.

The NeuroFinder Processing Tool automates the management of a comprehensive database containing company information related to neurotechnology. It facilitates the import, standardization, validation, and updating of company data files in multiple formats (e.g., CSV, Excel).

### Objective of This Notebook

This notebook aims to:
* Provide an interactive environment to run the NeuroFinder Processing Tool without the GUI.
* Allow you to load data files, process them, and export the results.
* Help you get familiar with the tool's functionalities.

### Prerequisites

Before running this notebook, ensure you have:

* Python 3.x installed.
* Necessary Python packages (we will install them in the next step).
* Access to the data files you wish to process.
* The main database files (main_database.xlsx, not_neurotech_database.xlsx).

In [1]:
# Install required packages
!pip install pandas openpyxl requests python-dotenv matplotlib seaborn sqlite3


Collecting seaborn
  Using cached seaborn-0.13.2-py3-none-any.whl.metadata (5.4 kB)


ERROR: Could not find a version that satisfies the requirement sqlite3 (from versions: none)
ERROR: No matching distribution found for sqlite3


In [2]:
!python.exe -m pip install --upgrade pip




In [1]:
# Import standard libraries
import os
import re
import unicodedata
from datetime import datetime as dt

# Import third-party libraries
import pandas as pd
from dotenv import load_dotenv

import warnings
warnings.filterwarnings("ignore", category=UserWarning, module="openpyxl")


### Loading Environment Variables

If you have a .env file with environment variables, you can load it using python-dotenv. Otherwise, we can set default paths.

In [2]:
# Load environment variables
load_dotenv()
MAIN_DB_PATH = os.getenv('MAIN_DB_PATH')
NOT_NEUROTECH_DB_PATH = os.getenv('NOT_NEUROTECH_DB_PATH')
NEW_COMPANIES_PATH = os.getenv('NEW_COMPANIES_PATH')
UPDATED_COMPANIES_PATH = os.getenv('UPDATED_COMPANIES_PATH')


Defining Helper Functions

In [3]:
def clean_value(value):
    """Cleans the input value by stripping unwanted characters and converting to int if possible."""
    if pd.isna(value):
        return value
    cleaned_value = str(value).strip('="')
    try:
        return int(cleaned_value)
    except ValueError:
        return cleaned_value

def clean_dataframe(filepath, file_type='csv'):
    """Reads a file into a DataFrame, cleans it, and returns the cleaned DataFrame."""
    read_function = pd.read_csv if file_type == 'csv' else pd.read_excel
    df = read_function(filepath, index_col=False,
                       engine='openpyxl' if file_type == 'excel' else None)
    if 'former company names' in df.columns:
        df['former company names'] = df['former company names'].astype(str)
    for col in df.columns:
        df[col] = df[col].apply(clean_value)
    return df

def escape_special_characters(name: str) -> str:
    """Replaces special characters in a filename with underscores to ensure compatibility."""
    return re.sub(r'[^a-zA-Z0-9-_]', '_', name)


### Initializing the Database Handler

Create an instance of the DbHandler class to manage your databases.

In [4]:
from main.backend import DbHandler
# Initialize the database handler
db_handler = DbHandler(MAIN_DB_PATH, NOT_NEUROTECH_DB_PATH)

# Review the data

In [5]:
db_handler.main_db.describe()

Unnamed: 0,Company Founded Year,Last Funding Amount,Total Funding Amount,Number of Funding Rounds,Company Number of Investors,Company Number of Investments,acquired,Inactive Year,Number of Patents,Unnamed: 53,Contact Name
count,262.0,93.0,108.0,124.0,116.0,6.0,95.0,54.0,1.0,0.0,0.0
mean,2013.454198,9096734.0,25796170.0,2.548387,3.939655,3.0,0.105263,2019.222222,13.0,,
std,11.614324,18588340.0,69100250.0,2.123637,4.175058,1.264911,0.30852,3.451369,,,
min,1905.0,10000.0,16000.0,0.0,0.0,1.0,0.0,2007.0,13.0,,
25%,2011.0,850000.0,1310000.0,1.0,1.0,2.25,0.0,2017.0,13.0,,
50%,2016.0,2200000.0,4000000.0,2.0,2.0,3.5,0.0,2019.0,13.0,,
75%,2019.0,10000000.0,22117500.0,3.0,5.0,4.0,0.0,2022.0,13.0,,
max,2024.0,150000000.0,556900000.0,10.0,20.0,4.0,1.0,2024.0,13.0,,


In [6]:
db_handler.main_db.shape
# 659 companies X 58 columns (features) in the main database

(273, 61)

In [7]:
db_handler.main_db.head()

Unnamed: 0,Company Name,Updating_Date,Logo in Visualization folder?,"Operation Status (Active=True, False = False)",INCLUSION,Operation/relevant Notes,Website,Startup Nation Page,Neurotech Category,Market Category,...,Number of Patents,Comments,Unnamed: 53,Contact Name,Contact Phone Number / Email,האם יצרנו איתם כבר קשר? (כדי לא להתיש),BrainstormIL contact,Unnamed: 58,Unnamed: 59,Normalized_Company_Name
0,AcousticView,2024-02-14 00:00:00,yes,True,True,,http://www.acousticview.com/,https://finder.startupnationcentral.org/compan...,Imaging | Neuromonitoring,Medical devices | Medical equipment,...,,,,,,,,,,acousticview
1,ActualSignal,2024-07-14 00:00:00,No,True,True,,https://www.actualsignal.com/,https://finder.startupnationcentral.org/compan...,NeuroreHabilitation | NeuroDegenerative | Neur...,Digital & Health care,...,,,,,,,,,,actualsignal
2,Adam CogTech,2024-02-14 00:00:00,yes,True,True,website does work,http://adam-cogtec.com/,https://finder.startupnationcentral.org/compan...,Cognitive Assessment & Enhancement,Consumer Electronics,...,,,,,,,אסף הראל,,,adamcogtech
3,AlgoSensus,2024-02-14 00:00:00,yes,True,True,website does work,https://www.algosensus.com/,https://finder.startupnationcentral.org/compan...,Cognitive Assessment & Enhancement,Medical devices | Medical equipment,...,,,,,,,,,,algosensus
4,Alpha Omega,2024-02-14 00:00:00,yes,True,True,,http://www.alphaomega-eng.com,https://finder.startupnationcentral.org/compan...,NeuroSurgery | NeuroDevices,Medical devices | Medical equipment,...,13.0,,,,,,,,,alphaomega


## Review functions

In [8]:
# Lets check if a company is in the database or in the not neurotech database
company_name = 'Thrombotech Ltd'
in_main = db_handler.is_company_in_database(company_name, db_handler.main_db)
in_not_neuro_tech = db_handler.is_company_in_database(company_name, db_handler.not_neurotech_db)
if in_main:
    print(f'Company "{company_name}" is found in the main database.')
elif in_not_neuro_tech:
    print(f'Company "{company_name}" is found in the not neurotech database.')
else:
    print(f'Company "{company_name}" is not found in any database.')
print(f'Company "{company_name}" is in the main database: {in_main}')
print(f'Company "{company_name}" is in the not neurotech database: {in_not_neuro_tech}')

Company "Thrombotech Ltd" is found in the main database.
Company "Thrombotech Ltd" is in the main database: True
Company "Thrombotech Ltd" is in the not neurotech database: False


In [9]:
db_handler.new_companies_db.shape

(0, 61)

In [10]:
for i in range(1,4):
    print(i)
    brain_path = f'jan25/brain{i}.csv'
    db_handler.start_searching_process(brain_path, "tsun")

1
in tsun searcher: adding BRAIN.Q to new companies
in tsun searcher: brain.space is in main or not_neurotech
in tsun searcher: Brain1 is in main or not_neurotech
in tsun searcher: BrainBalance is in main or not_neurotech
in tsun searcher: adding BrainCommerce to new companies
in tsun searcher: BrainFulness is in main or not_neurotech
in tsun searcher: Brainkos is in main or not_neurotech
in tsun searcher: BrainsGate is in main or not_neurotech
in tsun searcher: adding BrainStorm to new companies
in tsun searcher: Brainsway is in main or not_neurotech
in tsun searcher: BrainVivo is in main or not_neurotech
in tsun searcher: BrainWatch Tech is in main or not_neurotech
in tsun searcher: Artbrain is in main or not_neurotech
in tsun searcher: Autobrains is in main or not_neurotech
in tsun searcher: BestBrain is in main or not_neurotech
in tsun searcher: ELDA BrainTech is in main or not_neurotech
in tsun searcher: Excellent Brain is in main or not_neurotech
in tsun searcher: i-BrainTech is 

In [11]:
for i in range(1,5):
    print(i)
    cognition = f'jan25/cognition{i}.csv'
    db_handler.start_searching_process(cognition, "tsun")

1
in tsun searcher: Zamir Recognition Systems is in main or not_neurotech
in tsun searcher: AbiliSense is in main or not_neurotech
in tsun searcher: ActualSignal is in main or not_neurotech
in tsun searcher: aidymo-cv is in main or not_neurotech
in tsun searcher: AIO is in main or not_neurotech
in tsun searcher: aiOla is in main or not_neurotech
in tsun searcher: Alango Technologies is in main or not_neurotech
in tsun searcher: Alsomine is in main or not_neurotech
in tsun searcher: Amplio is in main or not_neurotech
in tsun searcher: Anonybit is in main or not_neurotech
in tsun searcher: AntisepTech is in main or not_neurotech
in tsun searcher: AnyClip is in main or not_neurotech
in tsun searcher: adding Arbe Robotics to new companies
in tsun searcher: AudioCodes is in main or not_neurotech
in tsun searcher: Autobrains is in main or not_neurotech
in tsun searcher: Beffi is in main or not_neurotech
in tsun searcher: BioGuard is in main or not_neurotech
in tsun searcher: Bosco is in main

In [12]:
for i in range(1,3):
    print(i)
    cognitive = f'jan25/cognitive{i}.csv'
    db_handler.start_searching_process(cognition, "tsun")

1
in tsun searcher: Sodyo is in main or not_neurotech
in tsun searcher: Somatix is in main or not_neurotech
in tsun searcher: SpeakingPal is in main or not_neurotech
in tsun searcher: Supersmart is in main or not_neurotech
in tsun searcher: Suspect Detection Systems is in main or not_neurotech
in tsun searcher: Syte is in main or not_neurotech
in tsun searcher: Tactile World is in main or not_neurotech
in tsun searcher: TalkSense is in main or not_neurotech
in tsun searcher: Technoso is in main or not_neurotech
in tsun searcher: The Digital Pets Company is in main or not_neurotech
in tsun searcher: Theia Vision AI is in main or not_neurotech
in tsun searcher: ThirdEye Systems is in main or not_neurotech
in tsun searcher: Toky is in main or not_neurotech
in tsun searcher: TRACXPOiNT is in main or not_neurotech
in tsun searcher: Trax Retail is in main or not_neurotech
in tsun searcher: Trough.AI is in main or not_neurotech
in tsun searcher: Trullion is in main or not_neurotech
in tsun se

In [13]:
for i in range(1,12):
    print(i)
    mental = f'jan25/mental{i}.csv'
    db_handler.start_searching_process(mental, "tsun")

1
in tsun searcher: Madrigal Mental Care is in main or not_neurotech
in tsun searcher: 4Girls is in main or not_neurotech
in tsun searcher: adding A.B.A Science Play to new companies
in tsun searcher: adding A.T Efal Technologies to new companies
in tsun searcher: adding Acktar to new companies
in tsun searcher: adding Actelis Networks to new companies
in tsun searcher: adding ActiveAging to new companies
in tsun searcher: adding AeRotor Unmanned Systems to new companies
in tsun searcher: adding Agam Energy Systems to new companies
in tsun searcher: adding AGIL to new companies
in tsun searcher: adding AGM Communication & Control to new companies
in tsun searcher: adding AGRIDERA Seeds & Agriculture to new companies
in tsun searcher: adding AgriPass to new companies
in tsun searcher: adding Agrorim to new companies
in tsun searcher: adding Ahava Dead Sea Laboratories to new companies
in tsun searcher: adding AIO Systems to new companies
in tsun searcher: adding AIONZ to new companies
i

In [14]:
for i in range(1,6):
    print(i)
    neuro = f'jan25/neuro{i}.csv'
    db_handler.start_searching_process(neuro, "tsun")

1
in tsun searcher: Neuro-Can is in main or not_neurotech
in tsun searcher: NeuroAudit is in main or not_neurotech
in tsun searcher: NeuroBlade is in main or not_neurotech
in tsun searcher: Neurobrave is in main or not_neurotech
in tsun searcher: NeuroDerm is in main or not_neurotech
in tsun searcher: Neurogait is in main or not_neurotech
in tsun searcher: Neurogenesis is in main or not_neurotech
in tsun searcher: Neurogenic is in main or not_neurotech
in tsun searcher: NeuroHELP is in main or not_neurotech
in tsun searcher: NeuroKaire is in main or not_neurotech
in tsun searcher: Neurolief is in main or not_neurotech
in tsun searcher: Neuromagen Pharma is in main or not_neurotech
in tsun searcher: Neuronix AI Labs is in main or not_neurotech
in tsun searcher: NeuroQuest is in main or not_neurotech
in tsun searcher: NeuroSense Therapeutics is in main or not_neurotech
in tsun searcher: Neurosteer is in main or not_neurotech
in tsun searcher: Neurotech Solutions is in main or not_neurote

In [16]:
# Lets check the new copmanies database shape: 0 compnaies = the file is empty
db_handler.new_companies_db.shape

(486, 61)

In [17]:
# Let's view the new potenital companies from CrunchBase
cb_path = "jan25/crunchbase search.csv"
db_handler.start_searching_process(cb_path, "cb")

In [18]:
# Lets check the new crunchbase data shape
db_handler.new_companies_db.shape

(500, 62)

In [None]:
cb_path = "jan25\crunchbase search.csv"

In [13]:
# Let's start the search prcoess with the crunchbase file path and data_type as "cb"
db_handler.start_searching_process(file_path=cb_path, data_type="cb")
db_handler.new_companies_db.shape # Lets check the new copmanies data base shape

(18, 62)

In [15]:
db_handler.new_companies_db.head()

Unnamed: 0,Company Name,Updating_Date,Logo in Visualization folder?,"Operation Status (Active=True, False = False)",INCLUSION,Operation/relevant Notes,Website,Startup Nation Page,Neurotech Category,Market Category,...,Comments,Unnamed: 53,Contact Name,Contact Phone Number / Email,האם יצרנו איתם כבר קשר? (כדי לא להתיש),BrainstormIL contact,Unnamed: 58,Unnamed: 59,Normalized_Company_Name,Company_Location
0,NeuroKaire,,,,,,,,,,...,,,,,,,,,neurokaire,"Tel Aviv-yafo, Tel Aviv, Israel"
1,BrainQ,,,,,,,,,,...,,,,,,,,,brainq,"Jerusalem, Yerushalayim, Israel"
2,Anicca Health,,,,,,,,,,...,,,,,,,,,aniccahealth,"Tel Aviv-jaffa, Tel Aviv, Israel"
3,TABI,,,,,,,,,,...,,,,,,,,,tabi,"Ashdod, HaDarom, Israel"
4,GaitBetter,,,,,,,,,,...,,,,,,,,,gaitbetter,"Haifa, Hefa, Israel"


#### Update new copmanies