# Notebook -- run
This notebook runs all the scripts of this project. Step by step, it creates the data loaded into Neo4j:  
the nodes and the relationships for the clustering graph and the Bitcoin network.

Before running the scripts, make sure that the Bitcoin transaction data is stored in JSON format in your  
working directory.
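
A quick way to verify this before launching the long-running steps is to list the JSON files in the working directory. The snippet below is a minimal sketch: the helper name `find_json_files` is hypothetical, and it assumes the transaction files use the `.json` extension and sit directly in the working directory.

```python
from pathlib import Path


def find_json_files(directory="."):
    """Return the sorted names of the .json files found directly in `directory`."""
    return sorted(p.name for p in Path(directory).glob("*.json"))


# Assumption: the notebook is started from the project's working directory.
json_files = find_json_files()
if not json_files:
    print("No JSON files found in", Path.cwd())
else:
    print(len(json_files), "JSON file(s) found, for example:", json_files[:5])
```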

###### To modify a script:
%load script_name.py
###### To get help for a function or script:
help(functions_write_save_load_data)  
help(create_data_clustering_h1)


######  Set the necessary paths

In [None]:
import os
import sys
sys.path.insert(0, './source')
sys.path.insert(0, './run')
sys.path.insert(0, './neo4j_scripts')
print(os.getcwd())

In [None]:
# example to get help
import functions_write_save_load_data
help(functions_write_save_load_data)  

In [None]:
# example to get help : function
from functions_extract_transform_data import rewrite
help(rewrite)

In [None]:
# example to get help : script
import create_data_clustering_h1
help(create_data_clustering_h1)

# Step 1 : Execute create_data_clustering_h1.py
This script transforms the BTC data and creates the data needed for address clustering:  
the nodes and the relationships of the clustering graph.

In [None]:
# Generate all the necessary data for the clustering graph with heuristic 1
%time exec(open("run/create_data_clustering_h1.py").read())

In [None]:
# If you want to apply the change address heuristic to the data
%time exec(open("run/add_heuristic_change_address.py").read())

# Step 2 : Build the clustering graph on Neo4j
- The nodes and the relationships are already created and stored in the working directory.  
- Run the script "neo4j_script_clustering_graph.sh" in a terminal or over SSH.  
- Start Neo4j.
- Then come back to this notebook and run the following cells.

In [None]:
# # On Windows
# from functions_write_save_load_data import run_terminal_script

# run_terminal_script('neo4j_scripts/neo4j_script_clustering_graph.bat')

In [None]:
# # On Linux
# from functions_write_save_load_data import run_terminal_script

# run_terminal_script('neo4j_scripts/neo4j_script_clustering_graph.sh')

In [None]:
# Connection to the Neo4j Graph (from laptop)
from py2neo import Graph

graph = Graph("bolt://localhost:7687", auth=("neo4j", "*****"))

In [None]:
# Connection to the Neo4j Graph on the server
from py2neo import Graph

graph = Graph("bolt://134.214.108.191:7687", auth=("neo4j", "++++++++"))

In [None]:
# Simple check
graph.run("MATCH (a:Addresses) RETURN a LIMIT 10").to_ndarray()

In [None]:
# Apply the UnionFind algorithm
graph.run(
    "CALL algo.unionFind('Addresses','SAME_ACT', {write:true, partitionProperty:'partition', concurrency:19})"
)
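
To make the UnionFind step concrete: it groups all addresses that are connected, directly or indirectly, by a SAME_ACT relationship and gives every address of a group the same partition value. The pure-Python sketch below illustrates that idea on toy data. It is not the Neo4j implementation; the function names and the sample addresses are hypothetical, and the partition value here is simply a representative address rather than the identifier Neo4j writes.

```python
def find(parent, x):
    """Return the representative of x, shortening the path along the way."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving
        x = parent[x]
    return x


def union_find_partitions(nodes, edges):
    """Map every node to the representative of its connected component."""
    parent = {n: n for n in nodes}
    for a, b in edges:
        root_a, root_b = find(parent, a), find(parent, b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two components
    return {n: find(parent, n) for n in nodes}


# Hypothetical toy data: five addresses and three SAME_ACT relationships.
addresses = ["A", "B", "C", "D", "E"]
same_act = [("A", "B"), ("B", "C"), ("D", "E")]

partitions = union_find_partitions(addresses, same_act)
print(partitions)
```

In this toy example, A, B and C end up in one partition and D and E in another.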

In [None]:
# Check the result
graph.run("MATCH (a:Addresses) RETURN a LIMIT 10").to_ndarray()

In [None]:
# Export the partitions. The result is stored as neo4j_partitions.csv
# in the folder /var/lib/neo4j/import
graph.run(
    'CALL apoc.export.csv.query("MATCH (n:Addresses) RETURN n.partition AS cluster_id, n.name AS address_id","neo4j_partitions.csv", {})'
)
print("The partitions are stored in /var/lib/neo4j/import as neo4j_partitions.csv.")

# Step 3: Export the partitions 
Copy the partitions from /var/lib/neo4j/import to the Python working directory ~/Bitcoin_Transaction_analysis.  
To do that, run the script "neo4j_export_partitions.sh" in the terminal.
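
If the notebook runs on the same machine as Neo4j and has read access to the import folder, the copy can also be sketched in Python. This is only an illustration and not the content of the shell script: the helper `copy_partitions` is hypothetical, and the example call is left commented out because the paths depend on the installation.

```python
import shutil
from pathlib import Path


def copy_partitions(src, dst_dir="."):
    """Copy the exported partitions file into dst_dir and return the new path."""
    src = Path(src)
    if not src.is_file():
        raise FileNotFoundError(f"Partitions file not found: {src}")
    return Path(shutil.copy(src, dst_dir))


# Example call with the paths mentioned above (uncomment to run):
# copy_partitions("/var/lib/neo4j/import/neo4j_partitions.csv",
#                 Path.home() / "Bitcoin_Transaction_analysis")
```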

In [None]:
# # On Windows
# from functions_write_save_load_data import run_terminal_script

# run_terminal_script('neo4j_script_export_partitions.bat')

# Step 4: Execute build_clusters.py
Running this script builds the clusters from neo4j_partitions.csv and stores them in a dictionary.  
Then all unique addresses (clusters of size 1) are added to the clusters.
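
The idea can be sketched as follows. This is not the code of build_clusters.py: the function name, the `single_` key prefix for clusters of size 1, and the sample rows are assumptions made for illustration. Only the column names `cluster_id` and `address_id` come from the export query used in Step 2.

```python
import csv
import io
from collections import defaultdict


def build_clusters(csv_file, all_addresses=()):
    """Group addresses by cluster_id, then add unclustered addresses as singletons."""
    clusters = defaultdict(set)
    for row in csv.DictReader(csv_file):
        clusters[row["cluster_id"]].add(row["address_id"])

    # Addresses that already belong to some cluster.
    clustered = set().union(*clusters.values())

    # Every remaining address becomes its own cluster of size 1.
    for address in all_addresses:
        if address not in clustered:
            clusters["single_" + address] = {address}
    return dict(clusters)


# Hypothetical sample standing in for neo4j_partitions.csv.
sample = io.StringIO("cluster_id,address_id\n0,A\n0,B\n3,C\n")
clusters = build_clusters(sample, all_addresses=["A", "B", "C", "D"])
print(clusters)
```

With a real file, the same function could be called on `open("neo4j_partitions.csv", newline="")` instead of the in-memory sample.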

In [None]:
%time exec(open("run/build_clusters.py").read())

# Step 5: Execute create_data_general_graph.py
This script creates the CSV files containing the data for building the Bitcoin graph.  
Nodes of the graph: the addresses, the transactions, the clusters  
Relationships of the graph: addresses-transactions, addresses-clusters  

The dictionary of labels is used in 'create_data_general_graph.py' to add the  
true 'identity' of an address to the properties of the address nodes. To create this dictionary,  
run the notebook create_dictionary_of_identities.ipynb. You will need the following data in your working directory:
- addresses1.csv, addresses2.csv, addresses3.csv, addresses4.csv
- meiklejohn.csv
- bttalk.json

If you don't want to add the address labels to the graph, you can change this in the script  
'./source/nodes_links_using_dic.py' by specifying "dic_names = None" in the parameters of the function.
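
As an illustration of how such an optional dictionary typically behaves, here is a minimal sketch. It is not the function from nodes_links_using_dic.py: the helper `address_node`, the property names, the "unknown" fallback and the sample label are all hypothetical.

```python
def address_node(address, dic_names=None):
    """Build the properties of an address node, adding an identity if labels are given."""
    node = {"address": address}
    if dic_names is not None:
        # Assumption: addresses missing from the dictionary get a placeholder label.
        node["identity"] = dic_names.get(address, "unknown")
    return node


# Hypothetical label dictionary.
labels = {"A": "exchange_x"}

print(address_node("A", dic_names=labels))  # labelled node
print(address_node("B", dic_names=labels))  # address without a known identity
print(address_node("A"))                    # dic_names=None: no identity property
```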

In [None]:
# Optional, can be skipped
# Running this notebook will create the dictionary of names in the working directory
!jupyter nbconvert --to notebook --execute create_dictionary_of_identities.ipynb

In [None]:
# Create the nodes and the links of the Bitcoin user graph
%time exec(open("run/create_data_bitcoin_graph.py").read())

# Step 6: Build the Bitcoin Graph on Neo4j 
Go back to the SSH session and run the script 'neo4j_script_bitcoin_graph.sh'.

In [None]:
# # On Linux
# from functions_write_save_load_data import run_terminal_script

# run_terminal_script('neo4j_scripts/neo4j_script_bitcoin_graph.sh')

In [None]:
# Check
# Connection to the Neo4j Graph on the server
from py2neo import Graph

graph = Graph("bolt://134.214.108.191:7687", auth=("neo4j", "++++++++"))