# NoSQL in Jupyter Notebook using Python

In this notebook we look at how to simulate the most common NoSQL database types using native Python.

(This is the version with a basic graph database example at the end.)



First we import the library we need in this notebook.

In [None]:
import json

Next we create a simple key-value store using a Python list. Each element of the list is a tuple. The first element of the tuple is the key, the second is the value.

Example:
[(key1, value1), (key2, value2), .... ]

In [None]:
# create an empty list
key_value_list = []
# use append to add each element of the database
key_value_list.append(('this is a key','this is a value'))
key_value_list.append(('this is another key','another value'))
key_value_list.append(('key3','value3'))
key_value_list.append((4,45))
 

Show the list as constructed

In [None]:
key_value_list


Add some of your own elements

In [None]:
key_value_list.append(('here' ,'something'))

Can we easily find the value for the key 'key3'?

In [None]:
# does this work?
key_value_list['key3']
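
The cell above raises a `TypeError`, because a list can only be indexed by an integer position (or a slice), not by a key. A minimal sketch that shows the error without stopping the notebook; the name `demo_list` is only for illustration:

```python
# a small list of (key, value) tuples, in the same shape as key_value_list
demo_list = [('key3', 'value3'), (4, 45)]

try:
    demo_list['key3']          # lists only accept integers or slices as index
except TypeError as error:
    lookup_error = str(error)
    print('Lookup failed:', lookup_error)

# indexing by position does work
print(demo_list[0])
```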

We need to construct a helper function. This function returns the index of the first tuple in our list that matches a key, or -1 if there is none.

In [None]:
def subtuple_index(tuples, key):
    # return the index of the first tuple whose key (element 0) equals key,
    # or -1 if no tuple has that key
    for i, item in enumerate(tuples):
        if item[0] == key:
            return i
    return -1

Let's try it

In [None]:
subtuple_index(key_value_list,'key3')

This means that the key 'key3' is in element 2 (we start counting at 0).
Let's check:

In [None]:
key_value_list[2]

To get only the value we 'unwrap' the tuple, i.e. we take the element at index 1 of the tuple.

In [None]:
key_value_list[2][1]

We can create a function that does all this in one go:

In [None]:
def value_at_index(key_value_list, key):
  return key_value_list[subtuple_index(key_value_list,key)][1]

Let's try it:

In [None]:
value_at_index(key_value_list,'key3')

However, what if the key does not exist?

In [None]:

value_at_index(key_value_list,'this key does not exist')

We get the last value in our list, because our helper function 'subtuple_index' returns -1 when the key is not found, and index -1 of a Python list refers to its last element.
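
Negative indexing can be checked on a small list; `letters` is a made-up example:

```python
letters = ['a', 'b', 'c']

print(letters[-1])   # index -1 counts from the end: the last element
print(letters[-2])   # index -2 is the second to last element

# the same element can be reached with a positive index
print(letters[-1] == letters[len(letters) - 1])
```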

Let's fix this:

In [None]:
def value_at_index_fixed(key_value_list, key):
  index = subtuple_index(key_value_list,key)
  if index != -1:
    return key_value_list[index][1]
  else:
    return 'key not found!'

Try it for a valid key:

In [None]:
value_at_index_fixed(key_value_list, 'this is a key')

And a key not in our 'database':

In [None]:
value_at_index_fixed(key_value_list, 'this key does not exist')
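
Returning the text 'key not found!' can be ambiguous, since a stored value could be that same string. One possible variation, sketched here with the hypothetical name `value_for_key`, returns a default chosen by the caller (`None` unless stated otherwise) and compares keys exactly:

```python
def value_for_key(pairs, key, default=None):
    # scan the (key, value) tuples and return the value of the first exact key match
    for k, v in pairs:
        if k == key:
            return v
    # no tuple had this key
    return default

# illustrative data in the same shape as key_value_list
pairs = [('this is a key', 'this is a value'), ('key3', 'value3'), (4, 45)]

print(value_for_key(pairs, 'key3'))
print(value_for_key(pairs, 'missing'))
print(value_for_key(pairs, 'missing', default='n/a'))
```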

## We can do this in a simpler way

What if we use a Python dictionary? This is actually Python's built-in key-value store.

Syntax:
   dict[key] = value

In [None]:
# empty dictionary
key_value_dict = {}

key_value_dict

In [None]:
# put in the same keys and values as before
key_value_dict['this is a key'] = 'this is a value'
key_value_dict['this is another key'] = 'another value'
key_value_dict['key3'] = 'value3'
key_value_dict[4] = 45

key_value_dict

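Note that since Python 3.7 a dictionary keeps its entries in the order in which they were inserted; older Python versions could show them in a different order.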

We do now have a much easier way to get the values:

In [None]:
key_value_dict['key3']

In [None]:
key_value_dict['this key does not exist']
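
The last cell raises a `KeyError`. A dictionary offers a few built-in ways to handle a missing key; a short sketch (the dictionary `kv` is a stand-in for `key_value_dict`):

```python
kv = {'key3': 'value3', 4: 45}

# 1. test for the key first
print('key3' in kv)
print('this key does not exist' in kv)

# 2. get() returns None, or a default of our choice, instead of raising
print(kv.get('this key does not exist'))
print(kv.get('this key does not exist', 'key not found!'))

# 3. catch the KeyError explicitly
try:
    kv['this key does not exist']
except KeyError:
    print('key not found!')
```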

# Document store

One way to do a document store in Python is to use JSON.

An example JSON object:

```
example = { "key1": "value1",
            "key2": { "key3": "value3" } }
```

In [None]:
example = '{ "key1":"value1", "key2": { "key3": "value3"} }'
example

Hang on, this looks just like a dictionary.

We can use the `json` module in Python to convert the string into a dictionary.



In [None]:
json_object = json.loads(example)
json_object["key1"]

If we look at the value for "key2", we see that this is not just a key/value store: the value for "key2" is itself another 'document'.

In [None]:
json_object["key2"]
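
Because the value for "key2" is itself a dictionary, lookups can be chained, and `json.dumps` turns the whole structure back into a JSON string. A small sketch using the same example string:

```python
import json

example = '{ "key1":"value1", "key2": { "key3": "value3"} }'
json_object = json.loads(example)

# chain the keys to reach a value inside the nested document
print(json_object["key2"]["key3"])

# convert the dictionary back to a JSON string
as_text = json.dumps(json_object)
print(as_text)

# parsing that string again gives an equal dictionary
print(json.loads(as_text) == json_object)
```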

Experiment with adding elements to the json_object document database.

In [None]:
#create some extra content for json_object

#display json_object
json_object
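
One way to approach this, as a sketch: the keys `key4` and `tags` and their values are invented for illustration.

```python
import json

json_object = json.loads('{ "key1":"value1", "key2": { "key3": "value3"} }')

# add a new top-level key with a nested document as its value
json_object["key4"] = {"name": "example", "count": 3}

# add a key inside the existing nested document
json_object["key2"]["tags"] = ["a", "b"]

# pretty-print the result as JSON
print(json.dumps(json_object, indent=2))
```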

# A graph database?

We can try the JSON document store for this.

We use two basic types of document: the node and the edge (or relationship).

In [None]:
#a node will look like this
node1 = '{ "type":"node", "id": 1, "data": { "name": "Alice", "age":18} }'

node1_obj = json.loads(node1)

node1_obj

Some more elements that are nodes:

In [None]:
node2 = '{ "type":"node", "id": 2, "data": { "name": "Bob", "age":22} }'
node3 = '{ "type":"node", "id": 3, "data": { "Type": "Group", "Name":"Chess"} }'
node2_obj = json.loads(node2)
node3_obj = json.loads(node3)

node2_obj

Example of a relationship:

In [None]:
rel1 = '{ "type":"relationship", "id":100, "data":{ "Label":"knows", "Since": "2001/10/03"}, "source": 1, "target": 2}'
rel1_obj = json.loads(rel1)
rel1_obj

Some more relationships:

In [None]:
rel2 = '{ "type":"relationship", "id":101, "data":{ "Label":"knows", "Since": "2001/10/04"}, "source": 2, "target": 1}'
rel3 = '{ "type":"relationship", "id":102, "data":{ "Label":"is_member", "Since": "2005/7/01"}, "source": 1, "target": 3}'
rel4 = '{ "type":"relationship", "id":103, "data":{ "Label":"is_member", "Since": "2011/02/14"}, "source": 2, "target": 3}'

rel2_obj = json.loads(rel2)
rel3_obj = json.loads(rel3)
rel4_obj = json.loads(rel4)

rel2_obj

Create an empty 'database':

In [None]:
database = '{"nodes" : {}, "edges": {} }'
database_obj = json.loads(database)

database_obj

Add the nodes:

In [None]:
database_obj["nodes"][node1_obj['id']] = node1_obj
database_obj["nodes"][node2_obj['id']] = node2_obj
database_obj["nodes"][node3_obj['id']] = node3_obj

database_obj

Add the relationships:

In [None]:
database_obj["edges"][rel1_obj['id']] = rel1_obj
database_obj["edges"][rel2_obj['id']] = rel2_obj
database_obj["edges"][rel3_obj['id']] = rel3_obj
database_obj["edges"][rel4_obj['id']] = rel4_obj

database_obj
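
One thing to watch for: the ids used as dictionary keys above are Python integers, but JSON object keys are always strings. If the database is saved with `json.dumps` and loaded again, the keys come back as strings. A minimal sketch with a cut-down database (`db` is illustrative):

```python
import json

db = {"nodes": {}, "edges": {}}
db["nodes"][1] = {"type": "node", "id": 1, "data": {"name": "Alice", "age": 18}}

print(1 in db["nodes"])          # the key is the integer 1

# save to a JSON string and load it back
restored = json.loads(json.dumps(db))

print(1 in restored["nodes"])    # the integer key is gone
print("1" in restored["nodes"])  # it has become the string "1"
```

The "id" field stored inside each document keeps its integer type, so it remains a reliable way to identify an element after a round trip.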

We need a helper function to find the elements (nodes or edges) whose data contains a given value.

In [None]:
def get_element(document_store, search_obj):
  # search the data of every node for a value equal to search_obj
  for node in document_store["nodes"].values():
    for key, value in node["data"].items():
      if value == search_obj:
        print(f'Node {node["id"]} ({node["data"]}) has an element "{key}", which is equal to "{search_obj}"')
  # do the same for every edge, and also report which nodes it connects
  for edge in document_store["edges"].values():
    for key, value in edge["data"].items():
      if value == search_obj:
        print(f'Edge or relationship {edge["id"]} ({edge["data"]}) has an element "{key}", which is equal to "{search_obj}"')
        print(f'This relationship is between nodes {edge["source"]} and {edge["target"]}')

Get a node which contains 'Alice':

In [None]:
get_element(database_obj, "Alice")

Get an edge (relationship) which contains 'is_member':

In [None]:
get_element(database_obj, "is_member")

What if the value we search for does not exist?

In [None]:
get_element(database_obj, "does not exist")
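
Nothing is printed here, because `get_element` only prints when it finds a match. A variation that returns the matches as a list makes the empty result explicit. This is a sketch; the function `find_elements` and the small `store` are hypothetical:

```python
def find_elements(document_store, search_obj):
    # collect (kind, id, key) for every node or edge whose data holds search_obj
    matches = []
    for kind in ("nodes", "edges"):
        for element in document_store[kind].values():
            for key, value in element["data"].items():
                if value == search_obj:
                    matches.append((kind, element["id"], key))
    return matches

# a cut-down store in the same layout as database_obj
store = {
    "nodes": {1: {"type": "node", "id": 1, "data": {"name": "Alice", "age": 18}}},
    "edges": {100: {"type": "relationship", "id": 100,
                    "data": {"Label": "knows", "Since": "2001/10/03"},
                    "source": 1, "target": 2}},
}

print(find_elements(store, "Alice"))
print(find_elements(store, "knows"))
print(find_elements(store, "does not exist"))
```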

For all the other queries more helper functions are needed!

Left as an exercise to the reader 🙃
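
As a possible starting point for that exercise, here is a sketch of one such helper. The function name `get_targets` and its parameters are assumptions for illustration; it follows the edges that leave a node, optionally filtered by label. The `store` is a cut-down copy of the database built above.

```python
def get_targets(document_store, source_id, label=None):
    # return the nodes reached by edges that start at source_id
    targets = []
    for edge in document_store["edges"].values():
        if edge["source"] != source_id:
            continue
        # when a label is given, skip edges with a different label
        if label is not None and edge["data"].get("Label") != label:
            continue
        targets.append(document_store["nodes"][edge["target"]])
    return targets

store = {
    "nodes": {
        1: {"type": "node", "id": 1, "data": {"name": "Alice", "age": 18}},
        2: {"type": "node", "id": 2, "data": {"name": "Bob", "age": 22}},
        3: {"type": "node", "id": 3, "data": {"Type": "Group", "Name": "Chess"}},
    },
    "edges": {
        100: {"type": "relationship", "id": 100, "data": {"Label": "knows"},
              "source": 1, "target": 2},
        102: {"type": "relationship", "id": 102, "data": {"Label": "is_member"},
              "source": 1, "target": 3},
    },
}

everyone = get_targets(store, 1)                     # all nodes Alice points to
groups = get_targets(store, 1, label="is_member")    # only her group memberships

print([node["id"] for node in everyone])
print([node["data"] for node in groups])
```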