# Moving Notebook Files

Use this notebook to move files or folders to a different folder within the repo. It:
- preserves each file's commit history by using `git mv`
- fixes relative links within the markdown cells of notebooks and within markdown files
- adds a banner at the top of each file to indicate the file's move and status

---
## Setup

Installs:

In [21]:
#!pip install GitPython

Imports:

In [51]:
import os, json, urllib.parse, IPython, pathlib, nbformat, re, git

nbformat.NO_CONVERT

nbformat.NO_CONVERT

## Test For Parsing

In [2]:
urllib.parse.quote_plus('user/repo/path/to the/file/file name.ipynb')

'user%2Frepo%2Fpath%2Fto+the%2Ffile%2Ffile+name.ipynb'

In [3]:
urllib.parse.quote('user/repo/path/to the/file/file name.ipynb')

'user/repo/path/to%20the/file/file%20name.ipynb'

In [4]:
urllib.parse.quote_plus('https://github.com')

'https%3A%2F%2Fgithub.com'

In [5]:
urllib.parse.quote('https://github.com')

'https%3A//github.com'
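The link-fixing code later in this notebook decodes links with `urllib.parse.unquote`, so it may help to confirm which decoder reverses which encoder. The following is a minimal sketch; the sample path is illustrative only. It shows that `unquote` reverses `quote`, while text produced by `quote_plus` needs `unquote_plus` to turn `+` back into spaces.

```python
import urllib.parse

path = 'path/to the/file name.ipynb'  # illustrative path with spaces

encoded = urllib.parse.quote(path)            # spaces become %20, slashes are kept
encoded_plus = urllib.parse.quote_plus(path)  # spaces become +, slashes become %2F

# unquote reverses quote
decoded = urllib.parse.unquote(encoded)

# unquote does NOT turn '+' back into a space; unquote_plus does
decoded_plus_partial = urllib.parse.unquote(encoded_plus)
decoded_plus = urllib.parse.unquote_plus(encoded_plus)

print(encoded, encoded_plus, decoded, decoded_plus_partial, decoded_plus, sep='\n')
```

Relative links in markdown are usually written with `%20` rather than `+`, which is consistent with decoding them through `unquote`.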

---
## Locations

In [7]:
os.getcwd()

'/home/jupyter/vertex-ai-mlops/Applied GenAI/resources'

In [96]:
repo_path = '/home/jupyter/vertex-ai-mlops'
from_path = repo_path + '/Applied GenAI/Vertex AI Search'
to_path = repo_path + '/Applied GenAI/legacy/Vertex AI Search'

files = [
    "Summarize Conversations - Text and Audio.ipynb",
    "Vertex AI GenAI For Document Q&A - USGA Rules For Golf.ipynb",
    "Vertex AI GenAI For Document Q&A v2 - MLB Rules For Baseball.ipynb",
    "Vertex AI GenAI For Document Q&A - NHL Rules For Hockey.ipynb",
    "Vertex AI GenAI For BigQuery Metadata - Make Better Tables.ipynb",
    "Vertex AI GenAI For Document Q&A - Healthcare Benefits Member Handbook.ipynb",
    "Vertex AI GenAI For Document Q&A - MCC Laws For Cricket.ipynb",
    "Vertex AI Matching Engine For Document Q&A.ipynb",
    "Vertex AI GenAI For Rewriting - BigQuery Advisor With Codey.ipynb",
    "Vertex AI GenAI For Document Q&A - Annual Report.ipynb",
    "Vertex AI GenAI For Document Q&A v2 - Deed Of Trust.ipynb",
    "Vertex AI GenAI For Document Q&A - FAA Regulations.ipynb",
    "Vertex AI GenAI For BigQuery Q&A - Overview.ipynb",
    "Vertex AI GenAI For Document Q&A - Local Government Trends.ipynb",
    "Vertex AI GenAI For Document Q&A v2 - Employee Handbook.ipynb",
    "Vertex AI GenAI For Document Q&A - MLB Rules For Baseball.ipynb",
    "Vertex AI GenAI For Document Q&A - NFL Rules For Football.ipynb",
    "Vertex AI GenAI For Document Q&A - NBA Rules For Basketball.ipynb",
    "Vertex AI GenAI For Document Q&A - IFAB Laws For Soccer.ipynb",
    "Vertex AI GenAI For Document Q&A - Municipal Securities.ipynb",
]

---
## Move Folder/File

This is a git repository, so move the files with `git mv old_file new_file` to preserve their commit history.

In [97]:
repo = git.Repo(repo_path)

In [100]:
if pathlib.Path(from_path).exists():
    repo.git.mv(from_path, to_path)
    print(f'Files moved from: \n\t{from_path}\nto:\n\t{to_path}')
else:
    print('Files in `from_path` do not exist, likely already moved.')

Files in `from_path` do not exist, likely already moved.


The moved files will be staged for commit.  Pause here and commit these changes.
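One thing worth checking before running the `git mv` cell above: as far as I know, `git mv` does not create a missing parent folder for the destination, so a move into a new `legacy/` folder may fail if that folder does not exist yet. The helper below is a minimal sketch of a pre-check; `ensure_destination_parent` is a hypothetical name, not part of GitPython or this notebook.

```python
import pathlib


def ensure_destination_parent(to_path):
    """Create the parent folder of `to_path` if it is missing and return it.

    Only the parent is created; `to_path` itself is left for `git mv` to create.
    """
    parent = pathlib.Path(to_path).parent
    parent.mkdir(parents=True, exist_ok=True)
    return parent


# Assumed usage in this notebook, before the move:
# ensure_destination_parent(to_path)
# repo.git.mv(from_path, to_path)
```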

---
## Files List

Create a list of files (`.md` and `.ipynb`) with their new full paths. If `to_path` is a file, the list contains just that one file. If `to_path` is a folder, the list contains every `.ipynb` and `.md` file directly inside it (subfolders are not searched).

In [101]:
def file_list(to_path):
    """Return full paths of the .ipynb and .md files at `to_path` (a file or a folder)."""
    files = []
    if os.path.isdir(to_path):
        # notebooks first, then markdown files; top level of the folder only
        for nb in pathlib.Path(to_path).glob("*.ipynb"):
            files.append(to_path+'/'+nb.name)
        for md in pathlib.Path(to_path).glob("*.md"):
            files.append(to_path+'/'+md.name)
    elif os.path.isfile(to_path):
        files.append(to_path)
    else:
        print(f'Check for existence of file/folder: {to_path}')

    for file in files:
        print(file)

    return files

In [102]:
files = file_list(to_path)

/home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search/Vertex AI Search Python Client Overview.ipynb
/home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search/Vertex AI Search For Grounding With Document Q&A.ipynb
/home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search/vertex_search_setup.md
/home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search/readme.md
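If a moved folder contains subfolders, a recursive variant may be useful. The following is a minimal sketch, not the notebook's own method: `file_list_recursive` is a hypothetical helper built on `pathlib.Path.rglob`, and skipping `.ipynb_checkpoints` folders is an assumption about what should be excluded.

```python
import pathlib


def file_list_recursive(to_path, suffixes=(".ipynb", ".md")):
    """Return sorted Path objects for matching files under `to_path`.

    A single file is returned as a one-item list; a missing path gives [].
    """
    root = pathlib.Path(to_path)
    if root.is_file():
        return [root]
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.rglob("*")
        if p.is_file()
        and p.suffix in suffixes
        and ".ipynb_checkpoints" not in p.parts  # assumed exclusion
    )
```

Returning `Path` objects instead of strings also removes the need to wrap each entry in `pathlib.Path` before reading it.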


---
## Find and Fix Links

In [105]:
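def link_fixer(file_path, link, from_path):
    """Check whether a relative link still resolves from the file's new location."""
    decoded_link = urllib.parse.unquote(link)
    # where the link points when resolved from the new location
    absolute_path = (file_path.parent / decoded_link).resolve()
    print(file_path.parent, decoded_link, absolute_path)
    if not absolute_path.exists():
        # where the link pointed when resolved from the old location
        old_absolute_path = (pathlib.Path(from_path) / decoded_link).resolve()
        #if old_absolute_path.exists():
        # NOTE: Path.relative_to raises ValueError when the target is not inside file_path.parent
        new_relative_link = old_absolute_path.relative_to(file_path.parent)
        new_absolute_path = (file_path.parent / new_relative_link).resolve()
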
        response = (
            file_path,
            decoded_link,
            absolute_path.exists(),
            new_relative_link,
            new_absolute_path.exists()
        )
    else:
        response = (
            file_path,
            decoded_link,
            absolute_path.exists(),
            )
    return response


def find_relative_links_in_file(file_path, from_path):
    
    relative_links = []
    # capture markdown links [text](link) and quoted links in href and src attributes;
    # each findall match is a (markdown_link, quote_char, html_link) tuple
    regex = r"(?:\[.*?\]\((.*?)\)|<\w+\s+[^>]*?(?:href|src)=(['\"])(.*?)\2)"
    if file_path.suffix == '.ipynb':
        nb = nbformat.read(file_path, nbformat.NO_CONVERT)
        for cell in nb.cells:
            if cell.cell_type == 'markdown':
                links = re.findall(regex, cell.source)
                for match in links:
                    #print(match)
                    link = match[0] or match[2]
                    if not link.startswith("http") and not link.startswith('/'):
                        relative_links.append(
                            link_fixer(file_path, link, from_path)
                        )
    elif file_path.suffix == '.md':
        with open(file_path, "r") as f:
            content = f.read()
            links = re.findall(regex, content)
            for match in links:
                #print(match)
                link = match[0] or match[2]
                if not link.startswith('http') and not link.startswith('/'):
                    relative_links.append(
                        link_fixer(file_path, link, from_path)
                    )
    
    return relative_links
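To see why the code above reads `match[0] or match[2]`, it can help to run the same pattern on a small string. This is a minimal sketch: the pattern is copied from `find_relative_links_in_file`, while `extract_relative_links`, the sample text, and the image path in it are hypothetical.

```python
import re

# same pattern as in find_relative_links_in_file:
# group 1 = markdown link, group 2 = quote character, group 3 = href/src value
LINK_PATTERN = r"(?:\[.*?\]\((.*?)\)|<\w+\s+[^>]*?(?:href|src)=(['\"])(.*?)\2)"


def extract_relative_links(text):
    """Return links that are neither http(s) URLs nor root-anchored paths."""
    links = []
    for md_link, _quote, html_link in re.findall(LINK_PATTERN, text):
        link = md_link or html_link  # only one of the two groups is filled per match
        if link and not link.startswith("http") and not link.startswith("/"):
            links.append(link)
    return links


sample = (
    'See [setup](./vertex_search_setup.md), the [site](https://github.com), '
    'and <img src="../images/step_0.png" width="45%">'
)
print(extract_relative_links(sample))
```

Because the pattern has three capture groups, `re.findall` returns a 3-tuple per match, with empty strings for the groups from the alternative that did not match.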

In [106]:
relative_links = []
for file_path in files:
    file_path = pathlib.Path(file_path)
    relative_links.extend(
        find_relative_links_in_file(file_path, from_path)
    )

/home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search ./vertex_search_setup.md /home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search/vertex_search_setup.md
/home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search ./vertex_search_setup.md /home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search/vertex_search_setup.md
/home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search ../../architectures/notebooks/applied/genai/vertex_ai_search/vertex_search_step_0.png /home/jupyter/vertex-ai-mlops/Applied GenAI/architectures/notebooks/applied/genai/vertex_ai_search/vertex_search_step_0.png


ValueError: '/home/jupyter/vertex-ai-mlops/architectures/notebooks/applied/genai/vertex_ai_search/vertex_search_step_0.png' is not in the subpath of '/home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search' OR one path is relative and the other is absolute.
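The `ValueError` above comes from `Path.relative_to`, which only works when the target sits inside the base folder and does not produce `..` segments. One possible alternative is `relpath` from `os.path` (shown here through `posixpath` so the result does not depend on the operating system). This is a minimal sketch under assumptions: `relink` is a hypothetical helper, and it assumes the link target itself was not moved.

```python
import posixpath
import urllib.parse


def relink(link, old_dir, new_dir):
    """Rewrite a relative link written for a file in `old_dir` so it works from `new_dir`."""
    decoded = urllib.parse.unquote(link)
    # absolute location the link pointed to before the move
    target = posixpath.normpath(posixpath.join(old_dir, decoded))
    # path from the new folder to that same target, with '..' segments as needed
    new_link = posixpath.relpath(target, start=new_dir)
    # re-encode spaces and other unsafe characters for use in markdown
    return urllib.parse.quote(new_link)


old_dir = '/home/jupyter/vertex-ai-mlops/Applied GenAI/Vertex AI Search'
new_dir = '/home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search'
link = '../../architectures/notebooks/applied/genai/vertex_ai_search/vertex_search_step_0.png'
print(relink(link, old_dir, new_dir))
```

As in `link_fixer`, this should only be applied to links that no longer resolve from the new location; links to files that moved together with the notebook are still valid as written.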

In [85]:
for file_path, link, exists in relative_links:
    print(file_path, '\n', link, '\n', exists, '\n\n')

/home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search/Vertex AI Search Python Client Overview.ipynb 
 ./vertex_search_setup.md 
 True 


/home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search/Vertex AI Search Python Client Overview.ipynb 
 ./vertex_search_setup.md 
 True 


/home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search/vertex_search_setup.md 
 ../../architectures/notebooks/applied/genai/vertex_ai_search/vertex_search_step_0.png 
 False 


/home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search/vertex_search_setup.md 
 ../../architectures/notebooks/applied/genai/vertex_ai_search/vertex_search_step_1.png 
 False 


/home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search/vertex_search_setup.md 
 ../../architectures/notebooks/applied/genai/vertex_ai_search/vertex_search_step_2.png 
 False 


/home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search/vertex_search_setup.md 
 ../../architectures/notebooks/appli