<img width="10%" alt="Naas" src="https://landen.imgix.net/jtci2pxwjczr/assets/5ice39g4.png?w=160"/>

# Generate Readme for Awesome Notebooks

## Input

### Import libraries

In [1]:
import os
import requests
import pandas as pd
import urllib.parse
try:
    from git import Repo
except ImportError:
    # GitPython is not installed: install it, then retry the import
    !pip install GitPython
    from git import Repo
from naas_drivers import markdown
from pprint import pprint
import json

### Setup Variables
- `readme_template`: Path of the README template file used as the base for the generated README.
- `naas_lab_logo`: URL of the Naas Lab logo image.
- `naas_chat_logo`: URL of the Naas Chat logo image.
- `template_request`: GitHub URL for submitting a template request, including the `assignees`, `labels`, `template`, and `title` query parameters.
- `bug_report`: GitHub URL for submitting a bug report, including the `assignees`, `labels`, `template`, and `title` query parameters.
- `start_data_product`: Raw URL of the `Naas_Start_data_product` notebook.
- `json_file`: Path of the output JSON file that stores the templates.
- `naas_lab_url`: URL prefix for opening a notebook in Naas Lab.
- `naas_chat_url`: URL prefix for using a notebook as a Naas Chat plugin.
- `readme`: Path of the generated README file.

In [2]:
# Inputs
readme_template = "README_template.md"
naas_lab_logo = "https://naasai-public.s3.eu-west-3.amazonaws.com/open_in_naas.svg"
naas_chat_logo = "https://naasai-public.s3.eu-west-3.amazonaws.com/open_in_naas.svg"
template_request = "https://github.com/jupyter-naas/awesome-notebooks/issues/new?assignees=&labels=&template=template-request.md&title=Tool+-+Action+of+the+notebook+"
bug_report = "https://github.com/jupyter-naas/awesome-notebooks/issues/new?assignees=&labels=bug&template=bug_report.md&title="
start_data_product = "https://raw.githubusercontent.com/jupyter-naas/awesome-notebooks/master/Naas/Naas_Start_data_product.ipynb"

# Outputs
json_file = "templates.json"
naas_lab_url = "https://app.naas.ai/user-redirect/naas/downloader?url="
naas_chat_url = "https://workspace.naas.ai/chat/use?plugin_url="
readme = "README.md"
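
The two URL prefixes above are later concatenated with each notebook's raw GitHub URL. The following is a minimal sketch of that composition, not the notebook's own code: `build_links` is a hypothetical helper, and `raw_url_base` copies the value used inside `get_all_notebooks`.

```python
from urllib.parse import quote

# Prefixes copied from the setup cell; raw_url_base mirrors get_all_notebooks
naas_lab_url = "https://app.naas.ai/user-redirect/naas/downloader?url="
naas_chat_url = "https://workspace.naas.ai/chat/use?plugin_url="
raw_url_base = "https://raw.githubusercontent.com/jupyter-naas/awesome-notebooks/master"


def build_links(file_path, is_plugin=False):
    """Hypothetical helper: return the Lab and Chat links for one notebook path."""
    # quote() percent-encodes spaces but leaves "/" untouched
    raw_url = f"{raw_url_base}/{quote(file_path)}"
    open_in_lab = f"{naas_lab_url}{raw_url}"
    # Only notebooks tagged as plugins get a Chat link
    open_in_chat = f"{naas_chat_url}{raw_url}" if is_plugin else ""
    return open_in_lab, open_in_chat


lab, chat = build_links("Abstract API/Abstract_API_Get_IP_Geolocation.ipynb")
print(lab)
print(repr(chat))
```

Folder names containing spaces, such as `Abstract API`, end up as `Abstract%20API` in the link.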

## Model

### Get all notebooks

In [3]:
def get_all_notebooks():
    # Init
    html_url_base = "https://github.com/jupyter-naas/awesome-notebooks/blob/master"
    raw_url_base = "https://raw.githubusercontent.com/jupyter-naas/awesome-notebooks/master"
    notebooks = []
    res_json = []
    
    # Open the local Git repository and get the active branch
    repo = Repo('.')
    branch = repo.active_branch
    
    # Get the recursive file tree from the GitHub API
    url = f"https://api.github.com/repos/jupyter-naas/awesome-notebooks/git/trees/{branch.name}?recursive=1"
    res = requests.get(url)
    if res.status_code == 200:
        # Fall back to an empty list if the response has no "tree" key
        res_json = res.json().get("tree", [])
    
    # Loop on files
    for r in res_json:
        file_path = r.get("path")
        notebook_path = urllib.parse.quote(file_path)
        if ".github" not in file_path and ".gitignore" not in file_path and "/" in file_path:
            if file_path.endswith(".ipynb"):
                data = {
                    "tool": file_path.split("/")[0],
                    "notebook_name": file_path.split("/")[1],
                    "notebook_path": notebook_path,
                    "html_url": f"{html_url_base}/{notebook_path}",
                    "raw_url": f"{raw_url_base}/{notebook_path}",
                }
                notebooks.append(data)
    return pd.DataFrame(notebooks)

df_notebooks = get_all_notebooks()
print("✅ Notebooks fetched:", len(df_notebooks))
df_notebooks.head(1)

✅ Notebooks fetched: 752


| | tool | notebook_name | notebook_path | html_url | raw_url |
|---|---|---|---|---|---|
| 0 | AWS | AWS_Daily_biling_notification_to_slack.ipynb | AWS/AWS_Daily_biling_notification_to_slack.ipynb | https://github.com/jupyter-naas/awesome-notebo... | https://raw.githubusercontent.com/jupyter-naas... |
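
The path filtering in `get_all_notebooks` can be tried without a network call or a local Git checkout. The sketch below assumes a hand-written list of tree entries shaped like the GitHub API response. `tree_to_records` is a hypothetical name, the sample entries are partly made up for illustration, and it returns plain dictionaries rather than a DataFrame.

```python
import urllib.parse

HTML_URL_BASE = "https://github.com/jupyter-naas/awesome-notebooks/blob/master"
RAW_URL_BASE = "https://raw.githubusercontent.com/jupyter-naas/awesome-notebooks/master"


def tree_to_records(tree):
    """Hypothetical offline variant: turn Git tree entries into notebook records."""
    records = []
    for entry in tree:
        file_path = entry.get("path", "")
        # Same filter as the notebook: skip repo config and top-level files
        is_repo_config = ".github" in file_path or ".gitignore" in file_path
        if is_repo_config or "/" not in file_path:
            continue
        if not file_path.endswith(".ipynb"):
            continue
        notebook_path = urllib.parse.quote(file_path)
        parts = file_path.split("/")
        records.append(
            {
                "tool": parts[0],
                "notebook_name": parts[1],
                "notebook_path": notebook_path,
                "html_url": f"{HTML_URL_BASE}/{notebook_path}",
                "raw_url": f"{RAW_URL_BASE}/{notebook_path}",
            }
        )
    return records


# Illustrative entries only; the real tree comes from the GitHub API
sample_tree = [
    {"path": "README.md"},
    {"path": ".github/workflows/ci.yml"},
    {"path": "AWS/AWS_Get_files_from_S3_bucket.ipynb"},
    {"path": "AWS/helper.py"},
    {"path": "Abstract API/Abstract_API_Get_IP_Geolocation.ipynb"},
]
for record in tree_to_records(sample_tree):
    print(record["tool"], record["notebook_path"])
```

Passing the returned list to `pd.DataFrame` gives the same columns as the preview above.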


### Create header button, generate markdown & json 

In [5]:
def create_first_cell(
    tool,
    title,
    open_in_lab,
    naas_lab_logo,
    open_in_chat,
    naas_chat_logo,
    template_request,
    bug_report,
    naas_lab_url,
    start_data_product,
):
    # Init
    notebook_title = ''
    open_in_lab_url = ''
    open_in_chat_url = ''
    template_request_url = ''
    bug_report_url = ''
    generate_data_product_url = ''
    
    # Create notebook title
    notebook_title = f"# {tool} - {title}\n"  # e.g. "# Jupyter Notebooks - Get libraries"
    
    # Create logos
    if open_in_lab != "":
        open_in_lab_url = f"""<a href="{open_in_lab}" target="_parent"><img src="{naas_lab_logo}"/></a>"""
    if open_in_chat != "":
        open_in_chat_url = f"""<a href="{open_in_chat}" target="_parent"><img src="{naas_chat_logo}"/></a>"""
        
    # Hyperlinks
    template_request_url = f"""<a href="{template_request}">Template request</a>"""
    
    title_url = (f"{tool}+-+{title}").replace(" ", "+")
    bug_report_url = f"""<a href="{bug_report}{title_url}:+Error+short+description">Bug report</a>"""
    
    start_data_product_url = f"{naas_lab_url}{start_data_product}"
    generate_data_product_url = f"""<a href="{start_data_product_url}" target="_parent">Generate Data Product</a>"""
    return f"""{notebook_title}{open_in_lab_url}{open_in_chat_url}<br><br>{template_request_url} | {bug_report_url} | {generate_data_product_url}"""

def get_imports(sources, imports):
    # Loop on source lines, keeping only lines that start an import statement
    # (a substring check would also match comments and strings)
    for source in sources:
        line = source.strip()
        if line.startswith("from ") and " import " in line:
            # "from lib import module as alias" -> "lib.module"
            lib = line.split("from ", 1)[1].split(" import ")[0].strip()
            module = line.split(" import ", 1)[1].split(" as ")[0].strip()
            imports.append(f"{lib}.{module}")
        elif line.startswith("import "):
            # "import library as alias" -> "library"
            library = line.split("import ", 1)[1].split(" as ")[0].strip()
            imports.append(library)
    return imports

def get_notebook_info(url):
    # Init
    action = ""
    title = ""
    tags = ""
    author = ""
    author_name = ""
    author_url = ""
    description = ""
    plugin = False
    imports = []
    first_cell = []
    
    # Request
    res = requests.get(url)
#     return res.json()
    
    # Manage result
    if res.status_code == 200:
        res_json = res.json()
        cells = res_json.get("cells")
        
        # Get metadata stored in fixed cells
        first_cell = cells[1].get("source")
        title = first_cell[0].replace("#", "").split("-")[-1].strip()
        action = "".join(first_cell).split("<br><br>")[0].split("\n")[-1].strip()
        tags = cells[2].get("source")[0].replace("**Tags:**", "").strip()
        tags = [f"#{tag.strip()}" for tag in tags.split("#") if tag != ""]
        author = cells[3].get("source")[0].replace("**Author:**", "").strip()
        author_name = author.replace("[", "").replace("]", "").split("(")[0].strip()
        if "(" in author:
            author_url = author.split("(")[-1].replace(")", "")
        description = "".join(cells[4].get("source")).replace("**Description:**", "").strip()
        
        # Get metadata stored in variable cells
        for cell in cells:
            cell_type = cell.get("cell_type")
            metadata = cell.get("metadata")
            sources = cell.get("source")
            if cell_type == "code":
                imports = get_imports(sources, imports)
            if metadata and not plugin:
                # "tags" may be missing from the cell metadata, so default to an empty list
                metadata_tags = metadata.get("tags") or []
                if "plugin" in metadata_tags:
                    plugin = True
        
    return title, action, tags, author_name, author_url, description, imports, plugin, "".join(first_cell)

url = "https://raw.githubusercontent.com/jupyter-naas/awesome-notebooks/master/OpenAI/OpenAI_Act_as_a_chef.ipynb"
title, action, tags, author_name, author_url, description, imports, plugin, first_cell = get_notebook_info(url)
print("- Title:", title)
print("- Action:", action)
print("- Tags:", tags)
print("- Author:", author_name)
print("- Author URL:", author_url)
print("- Description:", description)
print("- Imports:", imports)
print("- Plugin:", plugin)
print("- First cell:", first_cell)

- Title: Act as a chef
- Action: <a href="https://app.naas.ai/user-redirect/naas/downloader?url=https://raw.githubusercontent.com/jupyter-naas/awesome-notebooks/master/OpenAI/OpenAI_Act_as_a_chef.ipynb" target="_parent"><img src="https://naasai-public.s3.eu-west-3.amazonaws.com/open_in_naas.svg"/></a>
- Tags: ['#openai', '#chef', '#cooking', '#ai', '#machinelearning', '#deeplearning']
- Author: Florent Ravenel
- Author URL: https://www.linkedin.com/in/florent-ravenel/
- Description: This notebook will create a plugin to act as a chef and use OpenAI to create delicious recipes.
- Imports: ['json', 'naas', 'pprint.pprint']
- Plugin: True
- First cell: # OpenAI - Act as a chef
<a href="https://app.naas.ai/user-redirect/naas/downloader?url=https://raw.githubusercontent.com/jupyter-naas/awesome-notebooks/master/OpenAI/OpenAI_Act_as_a_chef.ipynb" target="_parent"><img src="https://naasai-public.s3.eu-west-3.amazonaws.com/open_in_naas.svg"/></a><br><br><a href="https://github.com/jupyter-naas
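
The `Imports` list above is built by splitting source lines on the words `from` and `import`. As an alternative sketch, not the notebook's method, the standard-library `ast` module can parse a code cell and report only real import statements. `extract_imports` and `_neutralise_magics` are hypothetical names, and replacing lines that begin with `!` or `%` by `pass` is an assumption made so that cells containing shell escapes or magics still parse.

```python
import ast


def _neutralise_magics(code):
    """Replace IPython magics and shell escapes with `pass` so the cell parses."""
    cleaned = []
    for line in code.splitlines():
        stripped = line.lstrip()
        if stripped.startswith(("!", "%")):
            indent = line[: len(line) - len(stripped)]
            cleaned.append(f"{indent}pass")
        else:
            cleaned.append(line)
    return "\n".join(cleaned)


def extract_imports(code):
    """Return imported names in the same 'lib.module' style as the list above."""
    found = []
    for node in ast.walk(ast.parse(_neutralise_magics(code))):
        if isinstance(node, ast.Import):
            # "import pandas as pd" -> "pandas"
            found.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # "from pprint import pprint" -> "pprint.pprint"
            module = node.module or ""
            found.extend(f"{module}.{alias.name}" for alias in node.names)
    return found


cell = """
import json
from pprint import pprint
import pandas as pd
# data from the import step
"""
print(extract_imports(cell))  # ['json', 'pprint.pprint', 'pandas']
```

The comment line in the sample contains both `from` and `import`, which a substring check would treat as an import. The parser ignores it.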

In [13]:
generated_list = ""
json_templates = []
folder = None

for row in df_notebooks.itertuples():
    notebook_path = row.notebook_path
    print(notebook_path)
    tool = row.tool
    raw_url = row.raw_url
    html_url = row.html_url
    open_in_lab = f"{naas_lab_url}{raw_url}"
    open_in_chat = ""
    
    # Get data from notebook
    title, action, tags, author_name, author_url, description, imports, plugin, first_cell = get_notebook_info(raw_url)
    
    # Create Open in MyChatGPT URL
    if plugin:
        open_in_chat = f"{naas_chat_url}{raw_url}"
        
    # Create OpenButton
    new_first_cell = create_first_cell(
        tool,
        title,
        open_in_lab,
        naas_lab_logo,
        open_in_chat,
        naas_chat_logo,
        template_request,
        bug_report,
        naas_lab_url,
        start_data_product,
    )
    
    # Display open button
#     markdown.display(new_first_cell)

    # Update OpenButton in notebook
    if first_cell != new_first_cell:
        print("Button to be updated:", html_url)
#         update_button()

    # Update json
    new_json = {
        'tool': tool,
        'notebook': title,
        'action': action,
        'tags': tags,
        'author': author_name,
        'author_url': author_url,
        'description':  description,
        "open_in_lab": open_in_lab,
        "open_in_chat": open_in_chat,
        "notebook_url": html_url,
        "imports": imports,
        "updated_at": "",
        "image_url": "",
    }
    json_templates.append(new_json)
    
    # Create markdown
    new_folder = row.tool
    if new_folder != folder:
        generated_list += f"\n## {new_folder}\n"
        folder = new_folder
    nb_redirect = f"* [{title}]({html_url})\n"
    generated_list += nb_redirect

AWS/AWS_Daily_biling_notification_to_slack.ipynb
AWS/AWS_Get_files_from_S3_bucket.ipynb
AWS/AWS_Read_dataframe_from_S3.ipynb
AWS/AWS_Send_dataframe_to_S3.ipynb
AWS/AWS_Upload_file_to_S3_bucket.ipynb
Abstract%20API/Abstract_API_Check_Email_Validation.ipynb
Abstract%20API/Abstract_API_Get_IP_Geolocation.ipynb
Advertools/Advertools_Analyze_website_content_using_XML_sitemap.ipynb
Advertools/Advertools_Audit_robots_txt_and_xml_sitemap_issues.ipynb
Affinity/Affinity_Sync_with_Notion_database.ipynb
Agicap/Agicap_Export_treasury_plan.ipynb
Agicap/Agicap_Export_treasury_plan_by_account.ipynb
Agicap/Agicap_List_companies.ipynb
Airtable/Airtable_Delete_data.ipynb
Airtable/Airtable_Get_data.ipynb
Airtable/Airtable_Insert_data.ipynb
Airtable/Airtable_Search_data.ipynb
AlphaVantage/AlphaVantage_Get_balance_sheet.ipynb
AlphaVantage/AlphaVantage_Get_cashflow_statement.ipynb
AlphaVantage/AlphaVantage_Get_company_overview.ipynb
AlphaVantage/AlphaVantage_Get_income_statement.ipynb
Azure%20Blob%20Storage/
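
The Markdown list is built by emitting a `## tool` heading each time the tool changes and a bullet for every notebook. Here is a minimal sketch of that grouping step on its own. `build_markdown_list` is a hypothetical helper and the sample titles are illustrative rather than read from the notebooks.

```python
def build_markdown_list(rows):
    """Hypothetical helper: group (tool, title, html_url) rows under '## tool' headings."""
    parts = []
    current_tool = None
    for tool, title, html_url in rows:
        # A heading is written only when the tool differs from the previous row,
        # so rows must already be ordered by tool
        if tool != current_tool:
            parts.append(f"\n## {tool}\n")
            current_tool = tool
        parts.append(f"* [{title}]({html_url})\n")
    return "".join(parts)


base = "https://github.com/jupyter-naas/awesome-notebooks/blob/master"
rows = [
    ("AWS", "Get files from S3 bucket", f"{base}/AWS/AWS_Get_files_from_S3_bucket.ipynb"),
    ("AWS", "Read dataframe from S3", f"{base}/AWS/AWS_Read_dataframe_from_S3.ipynb"),
    ("Airtable", "Get data", f"{base}/Airtable/Airtable_Get_data.ipynb"),
]
print(build_markdown_list(rows))
```

Because the comparison is with the previous row only, unsorted input would repeat a heading each time a tool reappears.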

## Output

### Preview the generated list

In [14]:
markdown.display(generated_list)

### Generate README.md

In [15]:
# Open README template
with open(readme_template) as f:
    template = f.read()

# Replace the placeholder with the list of templates in markdown format
template = template.replace("[[DYNAMIC_LIST]]", generated_list)

# Save README
with open(readme, "w") as f:
    f.write(template)
print("✅ README updated")

✅ README updated
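
If the template does not contain `[[DYNAMIC_LIST]]`, `str.replace` returns the text unchanged and the README is written without the list. The sketch below adds an explicit check for that case. `render_readme` is a hypothetical helper, and the temporary files stand in for `README_template.md` and `README.md`.

```python
import tempfile
from pathlib import Path

PLACEHOLDER = "[[DYNAMIC_LIST]]"


def render_readme(template_path, output_path, generated_list):
    """Hypothetical helper: fill the placeholder and write the README."""
    template = Path(template_path).read_text(encoding="utf-8")
    if PLACEHOLDER not in template:
        # Fail loudly instead of writing a README without the notebook list
        raise ValueError(f"{PLACEHOLDER} not found in {template_path}")
    content = template.replace(PLACEHOLDER, generated_list)
    Path(output_path).write_text(content, encoding="utf-8")
    return content


with tempfile.TemporaryDirectory() as tmp:
    template_file = Path(tmp) / "README_template.md"
    readme_file = Path(tmp) / "README.md"
    template_file.write_text("# Awesome Notebooks\n[[DYNAMIC_LIST]]\n", encoding="utf-8")
    rendered = render_readme(template_file, readme_file, "\n## AWS\n* [Get data](url)\n")
    print(rendered)
```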


### Preview json

In [16]:
print("✅ JSON len:", len(json_templates))
pprint(json_templates[0])

✅ JSON len: 752
{'action': '<a '
           'href="https://app.naas.ai/user-redirect/naas/downloader?url=https://raw.githubusercontent.com/jupyter-naas/awesome-notebooks/master/AWS/AWS_Daily_biling_notification_to_slack.ipynb" '
           'target="_parent"><img '
           'src="https://naasai-public.s3.eu-west-3.amazonaws.com/open_in_naas.svg"/></a>',
 'author': 'Maxime Jublou',
 'author_url': 'https://www.linkedin.com/in/maximejublou/',
 'description': 'This notebook sends a daily notification to a Slack channel '
                'with the billing information from an AWS account. It allows '
                'users to easily keep track of their AWS spending.',
 'image_url': '',
 'imports': ['datetime',
             'boto3',
             'naas',
             'dateutil.relativedelta',
             'pandas',
             'naas_drivers'],
 'notebook': 'Daily biling notification to slack',
 'notebook_url': 'https://github.com/jupyter-naas/awesome-notebooks/blob/master/AWS/AWS_Daily_bilin

### Generate json for naas manager & naas search

In [17]:
with open(json_file, 'w') as f:
    json.dump(json_templates, f)
print("✅ JSON file updated")

✅ JSON file updated
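
A quick way to confirm the file is usable downstream is to reload it and compare it with the in-memory list. This is a sketch under assumptions: `save_templates` is a hypothetical helper, and the `indent` and `ensure_ascii` settings are optional choices for readability rather than anything Naas requires.

```python
import json
import tempfile
from pathlib import Path


def save_templates(templates, path):
    """Hypothetical helper: write the templates as JSON and verify the round trip."""
    with open(path, "w", encoding="utf-8") as f:
        # indent and ensure_ascii=False keep the file human-readable
        json.dump(templates, f, ensure_ascii=False, indent=2)
    with open(path, encoding="utf-8") as f:
        reloaded = json.load(f)
    if reloaded != templates:
        raise ValueError("JSON round trip changed the data")
    return len(reloaded)


# One record with the same keys as json_templates; values shortened for illustration
sample = [
    {
        "tool": "AWS",
        "notebook": "Get files from S3 bucket",
        "tags": ["#aws", "#s3"],
        "imports": ["boto3"],
        "open_in_chat": "",
        "updated_at": "",
        "image_url": "",
    }
]
with tempfile.TemporaryDirectory() as tmp:
    count = save_templates(sample, Path(tmp) / "templates.json")
    print("✅ JSON len:", count)
```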
