## NeuroFinder Processing Tool - Jupyter Notebook Version

Welcome to the NeuroFinder Processing Tool. This Jupyter Notebook provides a non-GUI interface to run the project and get familiar with its functionalities. You can process your data files, update databases, and generate reports directly within this notebook.

The NeuroFinder Processing Tool automates the management of a comprehensive database containing company information related to neurotechnology. It facilitates the import, standardization, validation, and updating of company data files in multiple formats (e.g., CSV, Excel).

### Objective of This Notebook

This notebook aims to:
* Provide an interactive environment to run the NeuroFinder Processing Tool without the GUI.
* Allow you to load data files, process them, and export the results.
* Help you get familiar with the tool's functionalities.

### Prerequisites

Before running this notebook, ensure you have:

* Python 3.x installed.
* Necessary Python packages (we will install them in the next step).
* Access to the data files you wish to process.
* The main database files (main_database.xlsx, not_neurotech_database.xlsx).

In [1]:
# Install required packages
!pip install pandas openpyxl requests python-dotenv matplotlib seaborn sqlite3


Collecting seaborn
  Using cached seaborn-0.13.2-py3-none-any.whl.metadata (5.4 kB)


ERROR: Could not find a version that satisfies the requirement sqlite3 (from versions: none)
ERROR: No matching distribution found for sqlite3


In [2]:
!python.exe -m pip install --upgrade pip




In [1]:
# Import standard libraries
import os
import re
import unicodedata
from datetime import datetime as dt

# Import third-party libraries
import pandas as pd
from dotenv import load_dotenv

import warnings
warnings.filterwarnings("ignore", category=UserWarning, module="openpyxl")


### Loading Environment Variables

If you have a .env file with environment variables, you can load it using python-dotenv. Otherwise, we can set default paths.

In [2]:
# Load environment variables
load_dotenv()
MAIN_DB_PATH = os.getenv('MAIN_DB_PATH')
NOT_NEUROTECH_DB_PATH = os.getenv('NOT_NEUROTECH_DB_PATH')
NEW_COMPANIES_PATH = os.getenv('NEW_COMPANIES_PATH')
UPDATED_COMPANIES_PATH = os.getenv('UPDATED_COMPANIES_PATH')


Defining Helper Functions

In [3]:
def clean_value(value):
    """Cleans the input value by stripping unwanted characters and converting to int if possible."""
    if pd.isna(value):
        return value
    cleaned_value = str(value).strip('="')
    try:
        return int(cleaned_value)
    except ValueError:
        return cleaned_value

def clean_dataframe(filepath, file_type='csv'):
    """Reads a file into a DataFrame, cleans it, and returns the cleaned DataFrame."""
    read_function = pd.read_csv if file_type == 'csv' else pd.read_excel
    df = read_function(filepath, index_col=False,
                       engine='openpyxl' if file_type == 'excel' else None)
    if 'former company names' in df.columns:
        df['former company names'] = df['former company names'].astype(str)
    for col in df.columns:
        df[col] = df[col].apply(clean_value)
    return df

def escape_special_characters(name: str) -> str:
    """Replaces special characters in a filename with underscores to ensure compatibility."""
    return re.sub(r'[^a-zA-Z0-9-_]', '_', name)


### Initializing the Database Handler

Create an instance of the DbHandler class to manage your databases.

In [4]:
from main.backend import DbHandler
# Initialize the database handler
db_handler = DbHandler(MAIN_DB_PATH, NOT_NEUROTECH_DB_PATH)

# Review the data

In [5]:
db_handler.main_db.describe()

Unnamed: 0,Company Founded Year,Last Funding Amount,Total Funding Amount,Number of Funding Rounds,Company Number of Investors,Company Number of Investments,acquired,Inactive Year,Number of Patents,Unnamed: 53,Contact Name
count,262.0,93.0,108.0,124.0,116.0,6.0,95.0,54.0,1.0,0.0,0.0
mean,2013.454198,9096734.0,25796170.0,2.548387,3.939655,3.0,0.105263,2019.222222,13.0,,
std,11.614324,18588340.0,69100250.0,2.123637,4.175058,1.264911,0.30852,3.451369,,,
min,1905.0,10000.0,16000.0,0.0,0.0,1.0,0.0,2007.0,13.0,,
25%,2011.0,850000.0,1310000.0,1.0,1.0,2.25,0.0,2017.0,13.0,,
50%,2016.0,2200000.0,4000000.0,2.0,2.0,3.5,0.0,2019.0,13.0,,
75%,2019.0,10000000.0,22117500.0,3.0,5.0,4.0,0.0,2022.0,13.0,,
max,2024.0,150000000.0,556900000.0,10.0,20.0,4.0,1.0,2024.0,13.0,,


In [6]:
db_handler.main_db.shape
# 659 companies X 58 columns (features) in the main database

(273, 61)

In [7]:
db_handler.main_db.head()

Unnamed: 0,Company Name,Updating_Date,Logo in Visualization folder?,"Operation Status (Active=True, False = False)",INCLUSION,Operation/relevant Notes,Website,Startup Nation Page,Neurotech Category,Market Category,...,Number of Patents,Comments,Unnamed: 53,Contact Name,Contact Phone Number / Email,האם יצרנו איתם כבר קשר? (כדי לא להתיש),BrainstormIL contact,Unnamed: 58,Unnamed: 59,Normalized_Company_Name
0,AcousticView,2024-02-14 00:00:00,yes,True,True,,http://www.acousticview.com/,https://finder.startupnationcentral.org/compan...,Imaging | Neuromonitoring,Medical devices | Medical equipment,...,,,,,,,,,,acousticview
1,ActualSignal,2024-07-14 00:00:00,No,True,True,,https://www.actualsignal.com/,https://finder.startupnationcentral.org/compan...,NeuroreHabilitation | NeuroDegenerative | Neur...,Digital & Health care,...,,,,,,,,,,actualsignal
2,Adam CogTech,2024-02-14 00:00:00,yes,True,True,website does work,http://adam-cogtec.com/,https://finder.startupnationcentral.org/compan...,Cognitive Assessment & Enhancement,Consumer Electronics,...,,,,,,,אסף הראל,,,adamcogtech
3,AlgoSensus,2024-02-14 00:00:00,yes,True,True,website does work,https://www.algosensus.com/,https://finder.startupnationcentral.org/compan...,Cognitive Assessment & Enhancement,Medical devices | Medical equipment,...,,,,,,,,,,algosensus
4,Alpha Omega,2024-02-14 00:00:00,yes,True,True,,http://www.alphaomega-eng.com,https://finder.startupnationcentral.org/compan...,NeuroSurgery | NeuroDevices,Medical devices | Medical equipment,...,13.0,,,,,,,,,alphaomega


## Review functions

In [8]:
# Lets check if a company is in the database or in the not neurotech database
company_name = 'Thrombotech Ltd'
in_main = db_handler.is_company_in_database(company_name, db_handler.main_db)
in_not_neuro_tech = db_handler.is_company_in_database(company_name, db_handler.not_neurotech_db)
if in_main:
    print(f'Company "{company_name}" is found in the main database.')
elif in_not_neuro_tech:
    print(f'Company "{company_name}" is found in the not neurotech database.')
else:
    print(f'Company "{company_name}" is not found in any database.')
print(f'Company "{company_name}" is in the main database: {in_main}')
print(f'Company "{company_name}" is in the not neurotech database: {in_not_neuro_tech}')

Company "Thrombotech Ltd" is found in the main database.
Company "Thrombotech Ltd" is in the main database: True
Company "Thrombotech Ltd" is in the not neurotech database: False


In [9]:
db_handler.new_companies_db.shape

(0, 61)

In [10]:
for i in range(1,4):
    print(i)
    brain_path = f'jan25/brain{i}.csv'
    db_handler.start_searching_process(brain_path, "tsun")

1
2
3


In [11]:
for i in range(1,5):
    print(i)
    cognition = f'jan25/cognition{i}.csv'
    db_handler.start_searching_process(cognition, "tsun")

1
BRAIN.Q is in new db - in tsun
2
3
4


In [12]:
for i in range(1,3):
    print(i)
    cognitive = f'jan25/cognitive{i}.csv'
    db_handler.start_searching_process(cognition, "tsun")

1
Wedge is in new db - in tsun
2
Wedge is in new db - in tsun


In [13]:
for i in range(1,12):
    print(i)
    mental = f'jan25/mental{i}.csv'
    db_handler.start_searching_process(mental, "tsun")

1
Arbe Robotics is in new db - in tsun
2
3
4
5
6
,=Agriculture & Food Technologies is in new db - in tsun
7
8
9
10
11
Wizermed is in new db - in tsun


In [14]:
for i in range(1,6):
    print(i)
    neuro = f'jan25/neuro{i}.csv'
    db_handler.start_searching_process(neuro, "tsun")

1
2
BRAIN.Q is in new db - in tsun
BrainStorm is in new db - in tsun
Cogntiv is in new db - in tsun
EndorTech is in new db - in tsun
EndoStream Medical is in new db - in tsun
3
IntoSleep is in new db - in tsun
LuSeed Vascular is in new db - in tsun
Matricelf is in new db - in tsun
4
Wedge is in new db - in tsun
5
NRx Pharmaceuticals is in new db - in tsun
Nutaria is in new db - in tsun
Pimea AI is in new db - in tsun
Prilenia Therapeutics is in new db - in tsun
Qrons is in new db - in tsun
Tendermind is in new db - in tsun


In [15]:
# Lets check the new copmanies database shape: 0 compnaies = the file is empty
db_handler.new_companies_db.shape

(506, 61)

In [16]:
db_handler.new_companies_db.tail()

Unnamed: 0,Company Name,Updating_Date,Logo in Visualization folder?,"Operation Status (Active=True, False = False)",INCLUSION,Operation/relevant Notes,Website,Startup Nation Page,Neurotech Category,Market Category,...,Number of Patents,Comments,Unnamed: 53,Contact Name,Contact Phone Number / Email,האם יצרנו איתם כבר קשר? (כדי לא להתיש),BrainstormIL contact,Unnamed: 58,Unnamed: 59,Normalized_Company_Name
501,,,,,,,,https://finder.startupnationcentral.org/compan...,,,...,,,,,,,,,,prileniatherapeutics
502,,,,,,,,https://finder.startupnationcentral.org/compan...,,,...,,,,,,,,,,qrons
503,SurgiAI,,,,,,,https://finder.startupnationcentral.org/compan...,,,...,,,,,,,,,,surgiai
504,Synaptiflora,,,,,,,https://finder.startupnationcentral.org/compan...,,,...,,,,,,,,,,synaptiflora
505,,,,,,,,https://finder.startupnationcentral.org/compan...,,,...,,,,,,,,,,tendermind


In [17]:
# Let's view the new potenital companies from CrunchBase
cb_path = "jan25/crunchbase search.csv"
db_handler.start_searching_process(cb_path, "cb")

BrainQ is in new db - in cb
GaitBetter is in new db - in cb


In [18]:
# Lets check the new crunchbase data shape
db_handler.new_companies_db.shape

(522, 62)

In [19]:
db_handler.new_companies_db.tail()

Unnamed: 0,Company Name,Updating_Date,Logo in Visualization folder?,"Operation Status (Active=True, False = False)",INCLUSION,Operation/relevant Notes,Website,Startup Nation Page,Neurotech Category,Market Category,...,Comments,Unnamed: 53,Contact Name,Contact Phone Number / Email,האם יצרנו איתם כבר קשר? (כדי לא להתיש),BrainstormIL contact,Unnamed: 58,Unnamed: 59,Normalized_Company_Name,Company_Location
517,New Bio Technology,,,,,,,,,,...,,,,,,,,,newbiotechnology,"Or Akiva, Hefa, Israel"
518,Slavgroup,,,,,,,,,,...,,,,,,,,,slavgroup,"Rosh Ha'ayin, HaMerkaz, Israel"
519,Insight Sparks,,,,,,,,,,...,,,,,,,,,insightsparks,"Tel Aviv, Tel Aviv, Israel"
520,NEURONIX,,,,,,,,,,...,,,,,,,,,neuronix,"Yoqne`am `illit, HaZafon, Israel"
521,CogniZance,,,,,,,,,,...,,,,,,,,,cognizance,"Tel Aviv, Tel Aviv, Israel"


In [None]:
db_handler.new_companies_d

KeyError: 'Brainq'

#### Update new copmanies