# Industry relation mapping with NLP and Neo4j

This notebook builds a graph of the relationships between industries, with the aim of finding potential connections between two seemingly disconnected industries.

The data on the industries was obtained from the National Industrial Classification (NIC) report created by the Government of India, most probably to track taxable industries.

The data in the report was converted into a CSV file. Cypher queries for Neo4j were then generated with pandas and run directly against a Neo4j database using the Neo4j Python driver.

Keyword extraction (RAKE) was used to find relations between different classes and create mappings in the graph database; a spaCy entity-recognition step is included but commented out.

## Importing required libraries for cypher query generation for creation of nodes


In [2]:
import os
import pandas as pd
import re
import numpy as np
from neo4j import __version__ as neo4j_version
from neo4j import GraphDatabase

## Neo4j Python driver setup

Change the `uri`, `user` and `pwd` values to connect to your own database.

In [3]:
class Neo4jConnection:
    
    def __init__(self, uri, user, pwd):
        self.__uri = uri
        self.__user = user
        self.__pwd = pwd
        self.__driver = None
        try:
            self.__driver = GraphDatabase.driver(self.__uri, auth=(self.__user, self.__pwd))
        except Exception as e:
            
            print("Failed to create the driver:", e)
        
    def close(self):
        if self.__driver is not None:
            self.__driver.close()
        
    def query(self, query, db=None):
        assert self.__driver is not None, "Driver not initialized!"
        session = None
        response = None
        try: 
            session = self.__driver.session(database=db) if db is not None else self.__driver.session() 
            response = list(session.run(query))
        except Exception as e:
            print("Query failed:", e)
        finally: 
            if session is not None:
                session.close()
        return response

# change uri , user and password    
conn = Neo4jConnection(uri="bolt://localhost:7687", user="user", pwd="pass")

## Loading and cleaning the CSV

In [4]:
df = pd.read_csv('NIC.csv')
df.head()

Unnamed: 0,Group,Class,Sub-class,Description,Unnamed: 4,Unnamed: 5,Unnamed: 6,Unnamed: 7,Unnamed: 8,Unnamed: 9,Unnamed: 10
0,"SECTION A : AGRICULTURE, FORESTY AND FISHING",,,,,,,,,,
1,"DiviSion 01 : Crop anD animal proDuCtion, hunt...",,,,,,,,,,
2,11,,,Growing of non-perennial crops,,,,,,,
3,,111.0,,"Growing of cereals (except rice), leguminous c...",,,,,,,
4,,,,This class includes all forms of growing of ce...,,,,,,,


In [5]:
df = df.drop(["Unnamed: 4", "Unnamed: 5", "Unnamed: 6", "Unnamed: 7",
             "Unnamed: 8", "Unnamed: 9", "Unnamed: 10"], axis=1)
df.head()

Unnamed: 0,Group,Class,Sub-class,Description
0,"SECTION A : AGRICULTURE, FORESTY AND FISHING",,,
1,"DiviSion 01 : Crop anD animal proDuCtion, hunt...",,,
2,11,,,Growing of non-perennial crops
3,,111.0,,"Growing of cereals (except rice), leguminous c..."
4,,,,This class includes all forms of growing of ce...


## Creating Cypher Queries

The way the National Industrial Classification report indexes the known industries is by breaking it down into the following structure:

Section of Industry <--- Division of section <--- Group of Industry <--- Class of industry <--- Subclass of industry

Hence the nodes in the neo4j graph also have to branch out in the same orderly fashion.

The csv file was processed with regex and string matching techniques to obtain the required outputs.
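The cells below split each heading with fixed character offsets, which only works while every heading has exactly the same layout. As a minimal sketch, assuming headings always look like `SECTION A : NAME` and `DiviSion 01 : name` as in the dataframe preview, the same split could be written with regular expressions. `classify_label` is a hypothetical helper and is not part of the notebook.

```python
import re

# Patterns assume the heading layout seen in the dataframe preview:
# "SECTION A : NAME" and "DiviSion 01 : name" (letter case varies in the file).
SECTION_RE = re.compile(r"^SECTION\s+([A-Z])\s*:\s*(.+)$", re.IGNORECASE)
DIVISION_RE = re.compile(r"^DIVISION\s+(\d{2})\s*:\s*(.+)$", re.IGNORECASE)
GROUP_RE = re.compile(r"^\d+$")


def classify_label(label):
    """Return (level, code, name) for a value taken from the Group column.

    Hypothetical helper for illustration only.
    """
    text = str(label).strip()
    match = SECTION_RE.match(text)
    if match:
        return ("section", match.group(1).upper(), match.group(2).strip())
    match = DIVISION_RE.match(text)
    if match:
        return ("division", match.group(1), match.group(2).strip())
    if GROUP_RE.match(text):
        # Group rows carry their name in the Description column instead.
        return ("group", text, None)
    return ("other", None, None)


print(classify_label("SECTION A : AGRICULTURE, FORESTY AND FISHING"))
print(classify_label("DiviSion 01 : Crop anD animal proDuCtion"))
print(classify_label("11"))
```

Unlike slicing, this also strips the leading space that otherwise ends up at the start of each `name` property.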

In [6]:
cypher_text = []

## Creating nodes for the sections of industry

In [7]:
i = 0

while i < len(df):
    if str(df.loc[i, "Group"]).startswith("SECTION"):
        temp_for_section = str(df.loc[i, "Group"])
        required_cypher_text=("CREATE(" + temp_for_section[0:7]+"_"+temp_for_section[8] + ":SECTION {name:'" + temp_for_section[11:] +
              "'})-[:SECTION_OF_INDUSTRY]->(INDUSTRY)-[:SECTION_OF_INDUSTRY]->("+temp_for_section[0:7]+"_"+temp_for_section[8]+")")
        print(required_cypher_text)
        cypher_text.append(required_cypher_text)          
    
    i += 1



CREATE(SECTION_A:SECTION {name:' AGRICULTURE, FORESTY AND FISHING'})-[:SECTION_OF_INDUSTRY]->(INDUSTRY)-[:SECTION_OF_INDUSTRY]->(SECTION_A)
CREATE(SECTION_C:SECTION {name:' MANUFACTURING'})-[:SECTION_OF_INDUSTRY]->(INDUSTRY)-[:SECTION_OF_INDUSTRY]->(SECTION_C)
CREATE(SECTION_D:SECTION {name:' ELECTRICITY, GAS, STEAM AND AIRCONDITION SUPPLY'})-[:SECTION_OF_INDUSTRY]->(INDUSTRY)-[:SECTION_OF_INDUSTRY]->(SECTION_D)
CREATE(SECTION_F:SECTION {name:' CONSTRUCTION'})-[:SECTION_OF_INDUSTRY]->(INDUSTRY)-[:SECTION_OF_INDUSTRY]->(SECTION_F)
CREATE(SECTION_G:SECTION {name:' WHOLESALE AND RETAIL TRADE; REPAIR OF MOTOR'})-[:SECTION_OF_INDUSTRY]->(INDUSTRY)-[:SECTION_OF_INDUSTRY]->(SECTION_G)
CREATE(SECTION_H:SECTION {name:' TRANAPORT AND STORAGE'})-[:SECTION_OF_INDUSTRY]->(INDUSTRY)-[:SECTION_OF_INDUSTRY]->(SECTION_H)
CREATE(SECTION_J:SECTION {name:' INFORMATION AND COMMUNICATION'})-[:SECTION_OF_INDUSTRY]->(INDUSTRY)-[:SECTION_OF_INDUSTRY]->(SECTION_J)
CREATE(SECTION_M:SECTION {name:' PROFESSIONAL, 

## Creating nodes for the division of industry (division branches from section)

In [8]:

i = 0
set_section = ""

while i < len(df):

    if str(df.loc[i, "Group"]).startswith("SECTION"):
        temp_for_section = str(df.loc[i, "Group"])
        set_section = temp_for_section[0:7]+"_"+temp_for_section[8]

    if str(df.loc[i, "Group"]).startswith("DiviSion"):
        temp_for_division = str(df.loc[i, "Group"])
        required_cypher_text=("CREATE("+temp_for_division[0:8]+temp_for_division[9:11] + ":DiviSion {name:'" + temp_for_division[13:] +
              "'})-[:Division_under_section]->(" + set_section + ")-[:Division_under_section]->(" + temp_for_division[0:8]+temp_for_division[9:11] + ")")
    
        print(required_cypher_text)
        cypher_text.append(required_cypher_text)   
    
    i += 1



CREATE(DiviSion01:DiviSion {name:' Crop anD animal proDuCtion, huntinG anD relateD ServiCe aCtivitieS'})-[:Division_under_section]->(SECTION_A)-[:Division_under_section]->(DiviSion01)
CREATE(DiviSion02:DiviSion {name:' ForeStry anD loGGinG'})-[:Division_under_section]->(SECTION_A)-[:Division_under_section]->(DiviSion02)
CREATE(DiviSion03:DiviSion {name:' FiShinG anD aquaCulture'})-[:Division_under_section]->(SECTION_A)-[:Division_under_section]->(DiviSion03)
CREATE(DiviSion05:DiviSion {name:' mininG oF Coal anD liGnite'})-[:Division_under_section]->(SECTION_A)-[:Division_under_section]->(DiviSion05)
CREATE(DiviSion08:DiviSion {name:' other mininG anD quarryinG'})-[:Division_under_section]->(SECTION_A)-[:Division_under_section]->(DiviSion08)
CREATE(DiviSion09:DiviSion {name:' mininG Support ServiCe aCtivitieS'})-[:Division_under_section]->(SECTION_A)-[:Division_under_section]->(DiviSion09)
CREATE(DiviSion10:DiviSion {name:' manuFaCture oF FooD proDuCtS'})-[:Division_under_section]->(SEC

## Creating nodes for the group of industry (group branches from division)

In [9]:
i = 0
set_section = ""

while i < len(df):

    if str(df.loc[i, "Group"]).startswith("DiviSion"):
        temp_for_section = str(df.loc[i, "Group"])
        set_section = temp_for_section[0:8]+temp_for_section[9:11]

    if re.search("^[0-9]+$", str(df.loc[i, "Group"])):
        temp_for_division = str(df.loc[i, "Group"])
        required_cypher_text=("CREATE(Group_"+temp_for_division+":Group {name:'" + str(
            df.loc[i, "Description"]) + "'})-[:Group]->(" + set_section + ")-[:Group]->(Group_"+temp_for_division+")")
        print(required_cypher_text)
        cypher_text.append(required_cypher_text)     
    i += 1



CREATE(Group_11:Group {name:'Growing of non-perennial crops'})-[:Group]->(DiviSion01)-[:Group]->(Group_11)
CREATE(Group_12:Group {name:'Growing of perennial crops'})-[:Group]->(DiviSion01)-[:Group]->(Group_12)
CREATE(Group_13:Group {name:'plant propagation'})-[:Group]->(DiviSion01)-[:Group]->(Group_13)
CREATE(Group_14:Group {name:'animal production'})-[:Group]->(DiviSion01)-[:Group]->(Group_14)
CREATE(Group_15:Group {name:'mixed farming'})-[:Group]->(DiviSion01)-[:Group]->(Group_15)
CREATE(Group_16:Group {name:'Support activities to agriculture and post-harvest crop activities'})-[:Group]->(DiviSion01)-[:Group]->(Group_16)
CREATE(Group_17:Group {name:'hunting, trapping and related service activities'})-[:Group]->(DiviSion01)-[:Group]->(Group_17)
CREATE(Group_21:Group {name:'Silviculture and other forestry activities'})-[:Group]->(DiviSion02)-[:Group]->(Group_21)
CREATE(Group_22:Group {name:'logging'})-[:Group]->(DiviSion02)-[:Group]->(Group_22)
CREATE(Group_23:Group {name:'Gathering of

## Creating nodes for the class of industry (class branches from group)

In [10]:
i=0
set_section=""
sum=0

while i<len(df):
    
    
    if re.search("^[0-9]+$", str(df.loc[i,"Group"])) :
        temp_for_section ="Group_"+str(df.loc[i,"Group"])
        set_section= temp_for_section 
        #print(set_section)
    
    description="""  """
        
  
        
    if re.search("^[0-9]+$", str(df.loc[i,"Class"])) :
        temp_for_class =str(df.loc[i,"Class"])
        
        j=i
        while str(df.loc[j,"Sub-class"])=="nan" :
            #print(df.loc[j,"Description"])
            description = description+" " + str(df.loc[j,"Description"])
            #print(description)
            j=j+1
        
        sum=sum+1
        
        
        required_cypher_text=("\nCREATE(Class_"+temp_for_class+":Class {name:'" + str(df.loc[i,"Description"])+"',Description :'"+description +"'})-[:Group_to_class]->(" + set_section + ")-[:Group_to_class]->(Class_"+temp_for_class+")")
        print(required_cypher_text)
        cypher_text.append(required_cypher_text)  
        
    
    
    i=i+1




CREATE(Class_111:Class {name:'Growing of cereals (except rice), leguminous crops and oil seeds',Description :'   Growing of cereals (except rice), leguminous crops and oil seeds This class includes all forms of growing of cereals, leguminous crops and oil seeds in open fields, including those considered organic farming and the growing of genetically modified crops. The growing of these crops is often combined within agricultural units This class excludes: - growing of maize for fodder, see 0119'})-[:Group_to_class]->(Group_11)-[:Group_to_class]->(Class_111)

CREATE(Class_112:Class {name:'Growing of rice',Description :'   Growing of rice This class includes the growing of rice, including organic farming and the growing of genetically modified rice.'})-[:Group_to_class]->(Group_11)-[:Group_to_class]->(Class_112)

CREATE(Class_113:Class {name:'Growing of vegetables and melons, roots and tubers',Description :'   Growing of vegetables and melons, roots and tubers This class excludes: - gro

## Creating nodes for the sub-class of industry (sub-class branches from class)

In [11]:
i = 0
set_section = ""
sum = 0

while i < len(df):
    if re.search("^[0-9]+$", str(df.loc[i, "Class"])):
        temp_for_section = "Class_"+str(df.loc[i, "Class"])
        set_section = temp_for_section

    description = """  """

    if re.search("[0-9]+$", str(df.loc[i, "Sub-class"])):
        temp_for_sub_class = str(df.loc[i, "Sub-class"])
        temp_for_sub_class = temp_for_sub_class[:-2]

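        # NOTE: j starts at row 0, not at the current row, so the scan below
        # appends the opening rows of the dataframe to every sub-class name
        # (visible in the output). Starting from i + 1 is probably what was intended.
        j = 0
        description = str(df.loc[i, "Description"])

        # The bounds check keeps the scan from running past the last row.
        while j < len(df) and str(df.loc[j, "Sub-class"]) == "nan":
            description = description + " " + str(df.loc[j, "Description"])
            j = j + 1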

        required_cypher_text=("CREATE(Subclass_"+temp_for_sub_class +
              ":Subclass {name:'" + description+"'})-[:CLASS]->(" + set_section + ")-[:CLASS]->(Subclass_"+temp_for_sub_class+")")

        print(required_cypher_text)
        cypher_text.append(required_cypher_text)
        
        sum = sum + 1

    i += 1





CREATE(Subclass_1111:Subclass {name:'Growing of wheat nan nan Growing of non-perennial crops Growing of cereals (except rice), leguminous crops and oil seeds This class includes all forms of growing of cereals, leguminous crops and oil seeds in open fields, including those considered organic farming and the growing of genetically modified crops. The growing of these crops is often combined within agricultural units This class excludes: - growing of maize for fodder, see 0119'})-[:CLASS]->(Class_111)-[:CLASS]->(Subclass_1111)
CREATE(Subclass_1112:Subclass {name:'Growing of jowar, bajra and millets nan nan Growing of non-perennial crops Growing of cereals (except rice), leguminous crops and oil seeds This class includes all forms of growing of cereals, leguminous crops and oil seeds in open fields, including those considered organic farming and the growing of genetically modified crops. The growing of these crops is often combined within agricultural units This class excludes: - growing 
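Every sub-class name in the output above starts with the same text ("nan nan Growing of non-perennial crops ...") because the scan index is reset to row 0 rather than to the current row. Below is a minimal sketch of a bounded scan that starts at the current row. It assumes that a continuation row is one with no Group, Class or Sub-class code, which is my reading of the data rather than something the report states. `collect_description` is a hypothetical helper, and the rows are plain dicts so that the example runs without the CSV.

```python
def collect_description(rows, start):
    """Join the description at `start` with the continuation rows after it.

    Assumption: a continuation row has no Group, Class or Sub-class code.
    `rows` is a list of dicts; a missing key stands for an empty cell.
    """
    parts = [rows[start]["Description"]]
    j = start + 1
    # Stop at the end of the data or at the next coded row.
    while j < len(rows) and not any(
        rows[j].get(col) for col in ("Group", "Class", "Sub-class")
    ):
        if rows[j].get("Description"):
            parts.append(rows[j]["Description"])
        j += 1
    return " ".join(parts)


# Toy rows modelled on the first lines of the dataframe.
rows = [
    {"Class": "111", "Description": "Growing of cereals"},
    {"Description": "This class includes all forms of growing of cereals."},
    {"Sub-class": "1111", "Description": "Growing of wheat"},
    {"Sub-class": "1112", "Description": "Growing of jowar, bajra and millets"},
]
print(collect_description(rows, 0))
print(collect_description(rows, 2))
```

With a pandas dataframe the empty cells are `NaN`, which is truthy in Python, so the emptiness test would need `pd.notna(...)` instead of plain truthiness.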

## Printing out all the cypher queries

In [12]:
print(cypher_text)

["CREATE(SECTION_A:SECTION {name:' AGRICULTURE, FORESTY AND FISHING'})-[:SECTION_OF_INDUSTRY]->(INDUSTRY)-[:SECTION_OF_INDUSTRY]->(SECTION_A)", "CREATE(SECTION_C:SECTION {name:' MANUFACTURING'})-[:SECTION_OF_INDUSTRY]->(INDUSTRY)-[:SECTION_OF_INDUSTRY]->(SECTION_C)", "CREATE(SECTION_D:SECTION {name:' ELECTRICITY, GAS, STEAM AND AIRCONDITION SUPPLY'})-[:SECTION_OF_INDUSTRY]->(INDUSTRY)-[:SECTION_OF_INDUSTRY]->(SECTION_D)", "CREATE(SECTION_F:SECTION {name:' CONSTRUCTION'})-[:SECTION_OF_INDUSTRY]->(INDUSTRY)-[:SECTION_OF_INDUSTRY]->(SECTION_F)", "CREATE(SECTION_G:SECTION {name:' WHOLESALE AND RETAIL TRADE; REPAIR OF MOTOR'})-[:SECTION_OF_INDUSTRY]->(INDUSTRY)-[:SECTION_OF_INDUSTRY]->(SECTION_G)", "CREATE(SECTION_H:SECTION {name:' TRANAPORT AND STORAGE'})-[:SECTION_OF_INDUSTRY]->(INDUSTRY)-[:SECTION_OF_INDUSTRY]->(SECTION_H)", "CREATE(SECTION_J:SECTION {name:' INFORMATION AND COMMUNICATION'})-[:SECTION_OF_INDUSTRY]->(INDUSTRY)-[:SECTION_OF_INDUSTRY]->(SECTION_J)", "CREATE(SECTION_M:SECTION

## Creating the neo4j graph from all the cypher queries

In [25]:
len(cypher_text)

2029

In [13]:
for element in cypher_text:
    conn.query(element)
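One caveat about the loop above: Cypher variables such as `SECTION_A` or `INDUSTRY` are only visible inside the statement that defines them. Because each statement is sent in its own `session.run` call, a later statement that mentions `(SECTION_A)` does not, as far as I can tell, point at the node created earlier; it creates a new node with no label or properties. A hedged sketch of one alternative follows: look nodes up by a `code` property with `MERGE` and pass the values as query parameters. The `code` property, the `HAS_CHILD` relationship type and the `child_statement` helper are all assumptions made for illustration.

```python
def child_statement(parent_label, parent_code, child_label, child_code, name):
    """Build a (query, parameters) pair linking a child node to its parent.

    Hypothetical helper: the `code` property and the HAS_CHILD relationship
    type are illustrative choices, not names used by the notebook.
    Labels cannot be passed as Cypher parameters, so they are checked and
    interpolated; every value travels as a parameter, so quotes in names
    need no escaping.
    """
    for label in (parent_label, child_label):
        if not label.isidentifier():
            raise ValueError(f"unsafe label: {label!r}")
    query = (
        f"MERGE (p:{parent_label} {{code: $parent_code}}) "
        f"MERGE (c:{child_label} {{code: $child_code}}) "
        "SET c.name = $name "
        "MERGE (p)-[:HAS_CHILD]->(c)"
    )
    params = {"parent_code": parent_code, "child_code": child_code, "name": name}
    return query, params


query, params = child_statement(
    "SECTION", "A", "Division", "01", "Crop and animal production"
)
print(query)
print(params)
```

With the official driver the pair could be sent as `session.run(query, params)`; the `Neo4jConnection.query` method defined earlier would need an extra argument to forward the parameters.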

# Creating Relations between classes of industries

Now that we have all the known industries segmented and neatly stored in a directed graph in neo4j, it's time to find relations between different classes of industries and create edges to connect them to show the relationships in the neo4j graph.

To find relations, we would ideally use many kinds of text, such as news articles, blogs and literature corpora. Since we don't have the time for that right now, we will use only the descriptions of the industries provided in the NIC report.

(The spaCy entity recognition step is included below but commented out.)

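We will run the RAKE (Rapid Automatic Keyword Extraction) algorithm, via the `rake_nltk` package, to find keywords in the descriptions. We will then compare the classes, see which ones have similar or identical keywords, and create edges between those nodes to show that there is a relationship between the two industries. In the cells below, the keywords actually matched are the four-digit class codes that a description cites (for example "see 0119").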
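The keyword comparison described above can be sketched as a set overlap. This is a minimal illustration, assuming each class already has a list of keyword phrases. The Jaccard measure, the threshold value and the name `related_pairs` are assumptions chosen for the example and are not necessarily what the later cells do.

```python
from itertools import combinations


def jaccard(a, b):
    """Share of keywords two collections have in common (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def related_pairs(keywords_by_class, threshold=0.2):
    """Yield (class_a, class_b, score) for pairs whose overlap meets the threshold."""
    for (name_a, kw_a), (name_b, kw_b) in combinations(
        sorted(keywords_by_class.items()), 2
    ):
        score = jaccard(kw_a, kw_b)
        if score >= threshold:
            yield name_a, name_b, score


# Toy keyword lists, invented for illustration.
keywords_by_class = {
    "Growing of rice": ["rice", "organic farming", "genetically modified"],
    "Growing of cereals": ["cereals", "organic farming", "genetically modified"],
    "Logging": ["timber", "forest"],
}
for pair in related_pairs(keywords_by_class):
    print(pair)
```

Each yielded pair would correspond to one `related_industries` edge between two `Class` nodes.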

You may ask what the point of mapping these industries is if the relation is already mentioned in a text corpus. The answer is that the relationships formed from a single text only have a depth of 1.

When the graph is analysed, you will notice that most industry relations have a depth of more than 1.

What I mean by depth is this:

text corpus one = "The coffee is made with milk"

relation:- coffee -> milk (depth = 1)

text corpus two = "The cow makes milk"

relation:- milk -> cow (depth = 1)

now when we look at the graph what we would see is: 

coffee -> milk -> cow (depth = 2)

We will notice that the coffee industry is related to the dairy industry. (This example may seem trivial, but think about more complex industries such as machine parts, and about relations of greater depth.)
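The depth idea can be made concrete with a breadth-first search over the individual depth-1 relations. The sketch below uses the coffee example from the text; `relation_depths` is a hypothetical helper and the edge list is the toy data from above.

```python
from collections import deque


def relation_depths(edges, start):
    """Breadth-first search: depth of every node reachable from `start`."""
    graph = {}
    for source, target in edges:
        graph.setdefault(source, []).append(target)
    depths = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, []):
            if neighbour not in depths:
                depths[neighbour] = depths[node] + 1
                queue.append(neighbour)
    return depths


# Each pair is one depth-1 relation taken from a single text.
edges = [("coffee", "milk"), ("milk", "cow")]
print(relation_depths(edges, "coffee"))
```

Inside Neo4j the same question could be asked with a variable-length pattern such as `MATCH (a)-[:related_industries*1..2]->(b)`.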






In [14]:
import csv
#import spacy
from rake_nltk import Rake

## Extracting Keywords and storing them in a csv

In [15]:
"""
nlp = spacy.load("en_core_web_sm")
def get_entities(Article):
   
   doc = nlp(Article)
   entities_of_article = doc.ents
   for ent in entities_of_article:
       print(ent.text,ent.label_)
   return entities_of_article
"""

rake_nltk_var=Rake()
#Rapid Automatic Keyword Extraction
def get_keywords_using_rake_algorithm_from_nltk(text):
    rake_nltk_var.extract_keywords_from_text(text)
    keyword_extracted = rake_nltk_var.get_ranked_phrases()
    return keyword_extracted



Class_name = conn.query("Match(n:Class) return n.name")
class_descriptions=conn.query("Match(n:Class) return n.Description")

f = open('keyword_data_for_class_node_1.csv', 'w')
writer = csv.writer(f)
row_1=["Name_of_class","Keywords_from_description"]
writer.writerow(row_1)

names=[]

for class_name in Class_name:
   name=str(class_name)
   name=name[15:-1]
   names.append(name)

keywords_list=[]
for description in class_descriptions:
   # str(record) looks like "<Record n.Description='...'>";
   # the slice keeps only the quoted description text
   description=str(description)
   description_to_take_keywords_from= description[22:-1]

   keywords=get_keywords_using_rake_algorithm_from_nltk(description_to_take_keywords_from)

   keywords_list.append(keywords)

i=0
while i<len(names):
   
   required_row=[str(names[i]),str(keywords_list[i])]
   writer.writerow(required_row)
   i=i+1

      
f.close()
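The cell above recovers names and descriptions by slicing `str(record)` at fixed offsets, and it relies on two separate queries returning rows in the same order. A hedged sketch of a sturdier variant: read each value by key from one result set and write the CSV inside a `with` block. It assumes a single query that returns `n.name AS name, n.Description AS description`. `write_keyword_rows` is a hypothetical helper, and `str.split` stands in for the RAKE function so that the example runs without a database or NLTK data.

```python
import csv
import os
import tempfile


def write_keyword_rows(path, records, keyword_fn):
    """Write one CSV row per record: class name and extracted keywords.

    `records` is any iterable of mapping-like rows with "name" and
    "description" keys (driver records can be indexed by key this way).
    """
    # newline="" is what the csv module documentation recommends for writers;
    # the with-block closes the file even if keyword_fn raises.
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["Name_of_class", "Keywords_from_description"])
        for record in records:
            writer.writerow([record["name"], keyword_fn(record["description"])])


# Stand-in data and keyword function, for illustration only.
records = [{"name": "Growing of rice", "description": "growing of rice"}]
demo_path = os.path.join(tempfile.gettempdir(), "keyword_demo.csv")
write_keyword_rows(demo_path, records, str.split)
```

Names written this way no longer carry the surrounding quotes that the slicing approach keeps, so any later cell that relies on those quotes would need adjusting.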

## Creating a pandas dataframe with the keyword data csv

In [16]:
df = pd.read_csv('keyword_data_for_class_node_1.csv')
df_nic=pd.read_csv('NIC.csv')

## Creating another csv with just class names and placeholders of the class (id number)

In [17]:
class_place_holder_to_name=[]
i=0
set_section=""
sum=0

f = open('Class_place_holder_name_1.csv', 'w')
writer = csv.writer(f)
row_1=["Placeholder_of_class","Name"]
writer.writerow(row_1)

while i<len(df_nic):
    try:
        i=i+1
        if re.search("^[0-9]+$", str(df_nic.loc[i,"Group"])) :
            temp_for_section ="Group_"+str(df_nic.loc[i,"Group"])
            set_section= temp_for_section 
            #print(set_section)

        description="""  """



        if re.search("^[0-9]+$", str(df_nic.loc[i,"Class"])) :
            temp_for_class =str(df_nic.loc[i,"Class"])

            j=i
            while str(df_nic.loc[j,"Sub-class"])=="nan" :
                #print(df.loc[j,"Description"])
                description = description+" " + str(df_nic.loc[j,"Description"])
                #print(description)
                j=j+1

            sum=sum+1
            temp=temp_for_class
            string=str(df_nic.loc[i,"Description"])
            #class_place_holder_to_name.append(set_section)
            #class_place_holder_to_name.append(description)
            temp=[temp_for_class,str(df_nic.loc[i,"Description"])]
            writer.writerow(temp)
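    except KeyError:
        # i is incremented before the lookup, so the final pass indexes one row past the end
        print("one not in range")

f.close()  # flush the rows to disk before the file is read back below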

one not in range


## Creating a pandas dataframe with the class names csv


In [19]:
df_placeholder=pd.read_csv("Class_place_holder_name_1.csv")
df_placeholder

Unnamed: 0,Placeholder_of_class,Name
0,111,"Growing of cereals (except rice), leguminous c..."
1,112,Growing of rice
2,113,"Growing of vegetables and melons, roots and tu..."
3,114,Growing of sugar cane
4,115,Growing of tobacco
...,...,...
337,7722,renting of video tapes and disks
338,7729,renting and leasing of other personal and hou...
339,7730,"renting and leasing of other machinery, equip..."
340,7740,Leasing of nonfinancial intangible assets


## Checking values

In [20]:
# Sanity check: pull the four-digit NIC class codes (e.g. "0119") out of the first row's keywords
keywords_of_class_i=df.loc[0,"Keywords_from_description"]
regex= r"[0-9]{4}"

re.findall(regex,keywords_of_class_i)

['0119']
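The check above shows that a description's cross-references ("see 0119") survive in the keyword text as four-digit codes. A small sketch of turning them into the integer form used in the `Placeholder_of_class` column is given below. The word boundaries and the de-duplication are additions of mine, and `referenced_class_codes` is a hypothetical helper.

```python
import re

# \b keeps the pattern from matching four digits inside a longer number,
# such as a five-digit sub-class code (an added assumption, see lead-in).
CODE_RE = re.compile(r"\b[0-9]{4}\b")


def referenced_class_codes(text):
    """Return the distinct four-digit codes in `text` as ints, in order of appearance."""
    seen = []
    for code in CODE_RE.findall(text):
        value = int(code)  # int("0119") == 119, the form stored in the placeholder CSV
        if value not in seen:
            seen.append(value)
    return seen


print(referenced_class_codes(
    "This class excludes: - growing of maize for fodder, see 0119"
))
```

Dropping repeated codes avoids generating the same relation query several times for one class.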

## Creating cypher queries for adding the mapping edges for industries

In [21]:
regex= r"[0-9]{4}"
i=0
cypher_queries_for_creating_relations=[]
while i< len(df.index):
    name_of_class=str(df.loc[i,"Name_of_class"])
    keywords_of_class_i=str(df.loc[i,"Keywords_from_description"])
    # Four-digit numbers in the keywords are cross-references to other NIC classes
    links_to_class_numbers= re.findall(regex,keywords_of_class_i)

    for link in links_to_class_numbers:
        # int() drops any leading zero ("0119" -> 119), matching Placeholder_of_class
        link=int(link)
        location=df_placeholder.index[df_placeholder["Placeholder_of_class"]==link]

        # No match leaves an empty name; several matches leave "a', 'b" in the string
        name=df_placeholder.loc[location,"Name"]
        name=str(list(name))
        name_of_class_to_link=name[2:-2]

        # name_of_class already carries its own quotes from the CSV
        required_query = ("MATCH (n:Class) \nwhere n.name="+name_of_class+"\nMATCH (q:Class) \nwhere q.name='"+name_of_class_to_link+ "'\nCreate (n)-[:related_industries]->(q)-[:related_industries]->(n)")
        print(required_query)
        cypher_queries_for_creating_relations.append(required_query)
    i=i+1

MATCH (n:Class) 
where n.name='Growing of cereals (except rice), leguminous crops and oil seeds'
MATCH (q:Class) 
where q.name='Growing of other non-perennial crop'
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name='Growing of vegetables and melons, roots and tubers'
MATCH (q:Class) 
where q.name='Growing of spices, aromatic, drug and pharmaceutical crops'
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name='Growing of vegetables and melons, roots and tubers'
MATCH (q:Class) 
where q.name='plant propagation'
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name='Growing of sugar cane'
MATCH (q:Class) 
where q.name='Growing of vegetables and melons, roots and tubers'
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name='Growing of other non-perennial crop'
MATCH (q:Class) 
where q.name='Growing of spices, aromatic, dru

MATCH (n:Class) 
where n.name='mining of uranium and thorium ores'
MATCH (q:Class) 
where q.name='manufacture of basic precious and other non-ferrous metals'
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name='mining of uranium and thorium ores'
MATCH (q:Class) 
where q.name='manufacture of basic precious and other non-ferrous metals'
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name='mining of uranium and thorium ores'
MATCH (q:Class) 
where q.name='manufacture of basic chemicals'
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name='mining of other non-ferrous metal ores'
MATCH (q:Class) 
where q.name='manufacture of basic precious and other non-ferrous metals'
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name='mining of other non-ferrous metal ores'
MATCH (q:Class) 
where q.name='mining of uranium and thorium 

MATCH (n:Class) 
where n.name='manufacture of macaroni, noodles, couscous and similar farinaceous'
MATCH (q:Class) 
where q.name='manufacture of other food products n.e.c.'
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name='manufacture of prepared meals and dishes'
MATCH (q:Class) 
where q.name='other food service activities'
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name='manufacture of prepared meals and dishes'
MATCH (q:Class) 
where q.name='retail sale in non-specialized stores with food, beverages or tobacco'
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name='manufacture of prepared meals and dishes'
MATCH (q:Class) 
where q.name=' Wholesale of food, beverages and tobacco'
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name='manufacture of prepared meals and dishes'
MATCH (q:Class) 
where q.name='retail

MATCH (n:Class) 
where n.name='manufacture of other products of wood; manufacture of articles of cork,'
MATCH (q:Class) 
where q.name='other manufacturing n.e.c.'
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name='manufacture of other products of wood; manufacture of articles of cork,'
MATCH (q:Class) 
where q.name='manufacture of watches and clocks'
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name='manufacture of other products of wood; manufacture of articles of cork,'
MATCH (q:Class) 
where q.name='other manufacturing n.e.c.'
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name='manufacture of other products of wood; manufacture of articles of cork,'
MATCH (q:Class) 
where q.name='manufacture of games and toys'
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name='manufacture of other products of wood; manufact

MATCH (n:Class) 
where n.name='manufacture of basic iron and steel'
MATCH (q:Class) 
where q.name='Casting of iron and steel'
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name='manufacture of basic iron and steel'
MATCH (q:Class) 
where q.name='Casting of iron and steel'
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name='manufacture of basic iron and steel'
MATCH (q:Class) 
where q.name='Casting of iron and steel'
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name='manufacture of basic precious and other non-ferrous metals'
MATCH (q:Class) 
where q.name='manufacture of jewellery and related articles'
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name='manufacture of basic precious and other non-ferrous metals'
MATCH (q:Class) 
where q.name='Casting of non-ferrous metals'
Create (n)-[:related_industries]->(q)-[:

MATCH (n:Class) 
where n.name='manufacture of engines and turbines, except aircraft, vehicle and cycle'
MATCH (q:Class) 
where q.name='manufacture of electric motors, generators, transformers and electricity'
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name='manufacture of engines and turbines, except aircraft, vehicle and cycle'
MATCH (q:Class) 
where q.name='manufacture of electric motors, generators, transformers and electricity'
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name='manufacture of engines and turbines, except aircraft, vehicle and cycle'
MATCH (q:Class) 
where q.name='manufacture of other electrical equipment'
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name='manufacture of engines and turbines, except aircraft, vehicle and cycle'
MATCH (q:Class) 
where q.name='manufacture of air and spacecraft and related machinery'
Create (n)-[:rela

MATCH (n:Class) 
where n.name='manufacture of musical instruments'
MATCH (q:Class) 
where q.name=''
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name='manufacture of musical instruments'
MATCH (q:Class) 
where q.name='reproduction of recorded media'
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name='manufacture of musical instruments'
MATCH (q:Class) 
where q.name='Sound recording and music publishing activitiest'
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name='manufacture of musical instruments'
MATCH (q:Class) 
where q.name='repair of other equipment'
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name='manufacture of musical instruments'
MATCH (q:Class) 
where q.name='manufacture of games and toys'
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name='manufacture 

MATCH (n:Class) 
where n.name='Building completion and finishing'
MATCH (q:Class) 
where q.name='Specialized design activities'
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name='Building completion and finishing'
MATCH (q:Class) 
where q.name=''
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name='Building completion and finishing'
MATCH (q:Class) 
where q.name=''
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name='other specialized construction activities'
MATCH (q:Class) 
where q.name=' renting and leasing of other machinery, equipment and tangible goods'
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name='Sale of motor vehicles'
MATCH (q:Class) 
where q.name=' renting and leasing of motor vehicles'
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name='Sale of motor ve

MATCH (n:Class) 
where n.name='retail sale of textiles in specialized stores'
MATCH (q:Class) 
where q.name='retail sale of clothing, footwear and leather articles in specialized'
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name='Retail sale of carpets, rugs, wall and floor coverings in specialized'
MATCH (q:Class) 
where q.name='retail sale of hardware, paints and glass in specialized stores'
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name='retail sale of electrical household appliances, furniture, lighting'
MATCH (q:Class) 
where q.name='retail sale of second-hand goods'
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name='retail sale of books, newspapers and stationary in specialized stores'
MATCH (q:Class) 
where q.name='retail sale of second-hand goods'
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name=

MATCH (n:Class) 
where n.name='News agency activities'
MATCH (q:Class) 
where q.name=''
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name='News agency activities'
MATCH (q:Class) 
where q.name='photographic activities'
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name='Other monetary intermediation'
MATCH (q:Class) 
where q.name=' Activities auxiliary to financial service activities n.e.c.'
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name='Other monetary intermediation'
MATCH (q:Class) 
where q.name='other credit granting'
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name='other credit granting'
MATCH (q:Class) 
where q.name='Other monetary intermediation'
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)
MATCH (n:Class) 
where n.name=' Security and commodity contracts brokerage'
MATCH (q:Clas

MATCH (n:Class) 
where n.name='Gambling and betting activities'
MATCH (q:Class) 
where q.name=''
Create (n)-[:related_industries]->(q)-[:related_industries]->(n)


## Updating the neo4j graph

In [24]:
for element in cypher_queries_for_creating_relations:
    conn.query(element)

Query failed: {code: Neo.ClientError.Statement.SyntaxError} {message: Invalid input ',': expected whitespace, '.', node labels or rel types, '[', '^', '*', '/', '%', '+', '-', "=~", IN, STARTS, ENDS, CONTAINS, IS, '=', "<>", "!=", '<', '>', "<=", ">=", AND, XOR, OR, USE GRAPH, LOAD CSV, START, MATCH, UNWIND, MERGE, CREATE UNIQUE, CREATE, SET, DELETE, REMOVE, FOREACH, WITH, CALL, RETURN, UNION, ';' or end of input (line 4, column 51 (offset: 150))
"where q.name='Raising of horses and other equines', 'manufacture of articles of fur'"
                                                   ^}
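The syntax error above appears when one class code resolves to two names: `str(list(name))[2:-2]` then leaves `', '` in the middle of the string literal. A code with no match has the opposite problem and yields `q.name=''`, as several of the printed queries show. Below is a hedged sketch that emits one parameterised statement per resolved name and skips unresolved codes. The helper names and the codes in the toy rows are made up for illustration, and the sketch creates a single directed edge with `MERGE` rather than the two-way `CREATE` used above.

```python
def names_by_code(rows):
    """Map each class code to every name recorded for it."""
    lookup = {}
    for code, name in rows:
        lookup.setdefault(int(code), []).append(name)
    return lookup


RELATION_QUERY = (
    "MATCH (n:Class {name: $source}) "
    "MATCH (q:Class {name: $target}) "
    "MERGE (n)-[:related_industries]->(q)"
)


def relation_statements(source_name, codes, lookup):
    """Return one (query, parameters) pair per code that resolves to a name."""
    statements = []
    for code in codes:
        for target in lookup.get(code, []):  # unknown codes yield nothing
            if target != source_name:  # skip self-links
                statements.append(
                    (RELATION_QUERY, {"source": source_name, "target": target})
                )
    return statements


# Toy rows with invented codes; the duplicate code mirrors the failing case.
rows = [
    (142, "Raising of horses and other equines"),
    (142, "manufacture of articles of fur"),
    (119, "Growing of other non-perennial crop"),
]
lookup = names_by_code(rows)
for _, params in relation_statements("Growing of rice", [142, 9999], lookup):
    print(params)
```

Because the names travel as parameters, commas and apostrophes inside a class name cannot break the statement.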
