# Notion Indexer Exploration


This notebook demonstrates how to use the `notion_indexer` package, which wraps the Notion Python SDK, to fetch the content of a Notion page or database and display or export it as Markdown or CSV.

In [1]:
%load_ext autoreload
%autoreload 2

## Install Notion Indexer

This notebook assumes that the `notion_indexer` package sits in the project root, one directory above this notebook. Install it in editable mode before using it.


In [2]:
!cd ../; pip install -e .

Obtaining file:///mnt/c/Users/josep/Documents/DocumentsWSL/1-PROJECTS/essay-scoring-competition/notion-indexer
  Preparing metadata (setup.py) ... done
Installing collected packages: notion_indexer
  Attempting uninstall: notion_indexer
    Found existing installation: notion_indexer 0.1.0
    Uninstalling notion_indexer-0.1.0:
      Successfully uninstalled notion_indexer-0.1.0
  Running setup.py develop for notion_indexer
Successfully installed notion_indexer-0.1.0


## ⚠️ Restart the Jupyter kernel so that the newly installed library can be loaded ⚠️


## Import Libraries 


Import necessary classes and modules from your custom Notion indexer and the Notion Python SDK.

In [3]:
# Standard library
import os

# Third-party libraries
from dotenv import load_dotenv
from IPython.display import display, Markdown
from notion_client import Client  # Notion Python SDK

# Notion Indexer modules
from notion_indexer.notion_client import NotionClient
from notion_indexer.notion_reader import NotionReader
from notion_indexer.util import custom_get_id

## Load Environment Variables and Initialize Notion Client

Load your Notion integration token from a `.env` file (create one containing the line `NOTION_TOKEN=your_actual_token`), then create a Notion client object.


In [4]:
load_dotenv()
integration_token = os.getenv("NOTION_TOKEN")

# Ensure that the integration behind NOTION_TOKEN is connected to the Notion pages, blocks, and databases you want to read
notion_client = NotionClient(integration_token=integration_token)
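If `NOTION_TOKEN` is missing from the `.env` file, `os.getenv` returns `None`, and the failure may only surface later as an authentication error. A minimal sketch of a fail-fast check follows. The helper `require_env` and the variable `NOTION_TOKEN_DEMO` are hypothetical names introduced for illustration and are not part of `notion_indexer`.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`; raise if unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add a line such as {name}=your_actual_token "
            "to your .env file."
        )
    return value


# Demonstration with a placeholder variable so the real token is untouched.
# In the notebook, load_dotenv() would populate NOTION_TOKEN instead.
os.environ["NOTION_TOKEN_DEMO"] = "placeholder-token"
demo_token = require_env("NOTION_TOKEN_DEMO")
```

In the notebook itself, the equivalent call would be `require_env("NOTION_TOKEN")` right after `load_dotenv()`.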

In [5]:
l = notion_client.client.databases.query

In [6]:
notion_client.client.databases.query

<bound method DatabasesEndpoint.query of <notion_client.api_endpoints.DatabasesEndpoint object at 0x7f4876b48940>>

In [19]:
notion_nodes = []
notion_reader = NotionReader(integration_token=integration_token)
notion_node = notion_reader.load_data(
    "https://www.notion.so/joseph-maazal/Introducing-Navarasa-2-0-Indic-Gemma-7B-2B-Instruction-tuned-model-on-15-Indian-Languages-by-Rav-7bde469a1dec4ad6b3c1cecfb2244109#697deca2efb24602b16b0034b0780ffe",
    max_depth=1,
)
notion_nodes.append(notion_node)

In [20]:
notion_node = notion_reader.load_data(
    "https://www.notion.so/joseph-maazal/2b6cf01673584bd3b564eb92efd5233c?v=aa414de448fa41f8a497524773596cbe",
    max_depth=1,
)
notion_nodes.append(notion_node)
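Both URLs passed to `load_data` above end their path with a 32-character hexadecimal ID, which identifies the page or database. The sketch below shows one way such an ID could be pulled out of a URL. It is not the implementation of `custom_get_id` (which this notebook imports but never shows), and `extract_notion_id` is a hypothetical name.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Assumption: the URL path ends with a 32-character hexadecimal ID,
# optionally preceded by a human-readable slug and a hyphen.
_ID_AT_END = re.compile(r"([0-9a-fA-F]{32})$")


def extract_notion_id(url: str) -> Optional[str]:
    """Return the 32-character ID at the end of the URL path, or None."""
    # urlparse separates the path from the query (?v=...) and fragment (#...)
    path = urlparse(url).path.rstrip("/")
    match = _ID_AT_END.search(path)
    return match.group(1).lower() if match else None


database_url = (
    "https://www.notion.so/joseph-maazal/"
    "2b6cf01673584bd3b564eb92efd5233c?v=aa414de448fa41f8a497524773596cbe"
)
print(extract_notion_id(database_url))  # 2b6cf01673584bd3b564eb92efd5233c
```

The view ID in the query string and any block ID in the fragment are ignored here, since only the path is searched.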

In [33]:
import io
import zipfile
import datetime
from shutil import make_archive
from notion_indexer.database_node import DatabaseNode

In [37]:
zip_buffer = io.BytesIO()
with zipfile.ZipFile(zip_buffer, "a", zipfile.ZIP_DEFLATED, False) as zip_file:
    for idx, node in enumerate(notion_nodes):
        # Name each archive entry by the node's index in notion_nodes
        file_url = str(idx)

        if isinstance(node, DatabaseNode):
            zip_file.writestr(
                f"{file_url}.csv", node.to_dataframe().to_csv().encode("utf-8")
            )
        else:
            zip_file.writestr(f"{file_url}.md", node.to_markdown())

zip_buffer.seek(0)
file_data = zip_buffer.getvalue()
file_name = f"notion_nodes_{datetime.datetime.now().strftime('%Y-%m-%d_%H-%M-%S')}.zip"

with open(file_name, "wb") as f:
    f.write(file_data)
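The loop above names each archive entry by its index. If more descriptive names are wanted, one option is to derive a file-system-safe name from each source URL. The sketch below assumes the last path segment of the URL is a reasonable basis for the name. `url_to_filename` is a hypothetical helper, not part of `notion_indexer`.

```python
import re
from urllib.parse import urlparse


def url_to_filename(url: str, max_length: int = 80) -> str:
    """Turn the last path segment of a URL into a file-system-safe name."""
    # Drop the query string and fragment, then keep the final path segment
    last_segment = urlparse(url).path.rstrip("/").split("/")[-1]
    # Replace every run of characters outside [A-Za-z0-9._-] with an underscore
    safe = re.sub(r"[^A-Za-z0-9._-]+", "_", last_segment).strip("_")
    # Keep the tail rather than the head: in Notion URLs the ID sits at the end
    return safe[-max_length:] or "untitled"


print(url_to_filename("https://example.com/docs/My%20Page"))  # My_20Page
```

Keeping the tail of a long slug preserves the trailing ID, which makes name collisions between two pages with similar titles less likely.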

In [23]:
# display(Markdown(notion_nodes[1].to_markdown()))

In [None]:
print(notion_node.to_dict())

In [None]:
display(Markdown(notion_node.to_markdown()))

In [None]:
notion_node.to_dataframe().to_csv(index=False).encode("utf-8")

b'Related to N\xc3\xadcolas - Simulado Provas (Primeira Fase) (relation),PDF do Gabarito (url),Perc. Acertos (formula),Parent item (relation),T\xc3\xa9rmino da Prova (date),Link Simulado (relation),Dura\xc3\xa7\xc3\xa3o (min) (formula),Gabarito (select),C\xc3\xb3digo da Prova (rich_text),Aprendizados (rich_text),PDF da Prova (url),Quest\xc3\xa3o (number),Resposta (select),Resposta Correta? (formula),Fase (select),Dia (select),Sub-item (relation),Num. Acertos (formula),In\xc3\xadcio da Prova (date),Dificuldades (rich_text),Quest\xc3\xb5es (formula),T\xc3\xb3pico (multi_select),Vestibular (rollup),Nome (title)\n[],,,[{\'id\': \'33fcea95-78ca-47e9-8f73-4bf6fe9652ee\'}],,[],,C,,,,70.0,C,True,,,[],1.0,,,0,[],[],Questao 70\n[],,,[{\'id\': \'33fcea95-78ca-47e9-8f73-4bf6fe9652ee\'}],,[],,B,,,,69.0,D,False,,,[],0.0,,,0,[],[],Questao 69\n[],,,[{\'id\': \'33fcea95-78ca-47e9-8f73-4bf6fe9652ee\'}],,[],,A,,,,72.0,A,True,,,[],1.0,,,0,[],[],Questao 72\n[],,,[{\'id\': \'33fcea95-78ca-47e9-8f73-4bf6fe96

In [None]:
# dump notion_node into a pickle file
import pickle

# pickle.dump(notion_node, open("notion_node.pkl", "wb"))
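The commented-out call above passes an open file to `pickle.dump` without closing it. A minimal sketch of the same round trip using context managers follows. It uses a plain dictionary as a stand-in, since whether the node objects returned by `load_data` can be pickled is an assumption this notebook does not confirm.

```python
import pickle
import tempfile
from pathlib import Path

# Stand-in for notion_node; any picklable object is handled the same way.
sample_node = {"title": "Questao 70", "children": []}

with tempfile.TemporaryDirectory() as tmp_dir:
    pickle_path = Path(tmp_dir) / "notion_node.pkl"

    # The with-blocks guarantee the file is flushed and closed
    with pickle_path.open("wb") as f:
        pickle.dump(sample_node, f)

    with pickle_path.open("rb") as f:
        restored_node = pickle.load(f)
```

Only load pickle files you created yourself, because unpickling can execute arbitrary code.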