# Moving Notebook Files

Moves files or folders to a different folder within the repo:
- Preserves each file's commit history by using `git mv`.
- Fixes relative links within markdown cells of notebooks and within markdown files.
- Adds a banner at the top of each file to indicate the move and the file's status.

---
## Setup

Installs:

In [21]:
#!pip install GitPython

Imports:

In [51]:
import os, json, urllib.parse, IPython, pathlib, nbformat, re, git

nbformat.NO_CONVERT

nbformat.NO_CONVERT

## Test For Parsing

In [2]:
urllib.parse.quote_plus('user/repo/path/to the/file/file name.ipynb')

'user%2Frepo%2Fpath%2Fto+the%2Ffile%2Ffile+name.ipynb'

In [3]:
urllib.parse.quote('user/repo/path/to the/file/file name.ipynb')

'user/repo/path/to%20the/file/file%20name.ipynb'

In [4]:
urllib.parse.quote_plus('https://github.com')

'https%3A%2F%2Fgithub.com'

In [5]:
urllib.parse.quote('https://github.com')

'https%3A//github.com'
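
The outputs above show the difference that matters for link handling: `quote` leaves `/` intact and encodes spaces as `%20`, while `quote_plus` also encodes `/` and writes spaces as `+`. A minimal sketch of the round trip, assuming links in the markdown use the `%20` form (the sample path is made up for illustration):

```python
import urllib.parse

path = 'path/to the/file name.ipynb'

# quote keeps '/' (its default safe character) and encodes spaces as %20
encoded = urllib.parse.quote(path)
decoded = urllib.parse.unquote(encoded)

# quote_plus output needs unquote_plus; plain unquote leaves '+' untouched
plus_encoded = urllib.parse.quote_plus(path)
plus_with_unquote = urllib.parse.unquote(plus_encoded)
plus_with_unquote_plus = urllib.parse.unquote_plus(plus_encoded)

print(encoded)
print(plus_with_unquote)
```

Since `unquote` does not turn `+` back into a space, `quote`/`unquote` is the safer pair for relative file links.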

---
## Locations

`from_path` and `to_path` can be a folder or a specific file.

In [190]:
os.getcwd()

'/home/jupyter/vertex-ai-mlops/Applied GenAI/resources'

In [200]:
repo_path = pathlib.Path('/home/jupyter/vertex-ai-mlops')
from_path = repo_path.joinpath('Applied GenAI/Vertex AI Search')
to_path = repo_path.joinpath('Applied GenAI/legacy/Vertex AI Search')
repo_path, from_path, to_path

(PosixPath('/home/jupyter/vertex-ai-mlops'),
 PosixPath('/home/jupyter/vertex-ai-mlops/Applied GenAI/Vertex AI Search'),
 PosixPath('/home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search'))

---
## Move Folder/File

This is a git repository, so move the files with `git mv old_file new_file` to preserve their commit history.

In [205]:
repo = git.Repo(repo_path)

In [206]:
to_path.exists()

True

In [209]:
if from_path.exists():
    repo.git.mv(from_path, to_path)
    print(f'Files moved from: \n\t{from_path}\nto:\n\t{to_path}')
elif to_path.exists():
    print(f'It appears the file(s) have already moved to:\n\t{to_path}')
else:
    print('Make sure the file(s) exist. Currently not found in the from or to location')

It appears the file(s) have already moved to:
	/home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search


The moved files are now staged for commit. Pause here and commit these changes before continuing.

---
## Files List

Create a list of files (`.md` and `.ipynb`) with their new full paths. If `to_path` is a file, the list contains just that one file. If `to_path` is a folder, the list contains every `.md` and `.ipynb` file directly inside that folder.

In [212]:
def file_list(to_path):
    files = []
    if to_path.is_dir():
        for nb in to_path.glob("*.ipynb"):
            files.append(nb)
        for md in to_path.glob("*.md"):
            files.append(md)
    elif to_path.is_file() and to_path.suffix in ['.md', '.ipynb']:
        files.append(to_path)
    else:
        print(f'Check for existence of file/folder: {to_path}')

    if files:
        for file in files: print(file)

    return files

In [213]:
files = file_list(to_path)

/home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search/Vertex AI Search Python Client Overview.ipynb
/home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search/Vertex AI Search For Grounding With Document Q&A.ipynb
/home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search/vertex_search_setup.md
/home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search/readme.md
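
Note that `glob("*.ipynb")` and `glob("*.md")` only match files directly inside the folder. If a moved folder contains subfolders, a recursive variant may be needed. A minimal sketch using `Path.rglob`; the name `file_list_recursive` and the temporary test folder are hypothetical and not part of this notebook:

```python
import pathlib
import tempfile

def file_list_recursive(to_path, suffixes=('.ipynb', '.md')):
    """Hypothetical recursive variant of file_list: also descends into subfolders."""
    to_path = pathlib.Path(to_path)
    if to_path.is_dir():
        # rglob('*') walks the whole tree; keep only regular files with a wanted suffix
        return sorted(p for p in to_path.rglob('*') if p.is_file() and p.suffix in suffixes)
    if to_path.is_file() and to_path.suffix in suffixes:
        return [to_path]
    return []

# Try it on a throwaway folder so nothing in the repo is touched
with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    (root / 'sub').mkdir()
    for name in ('readme.md', 'sub/demo.ipynb', 'sub/image.png'):
        (root / name).touch()
    found = [p.relative_to(root).as_posix() for p in file_list_recursive(root)]

print(found)
```

A recursive walk would also pick up notebooks in hidden folders such as `.ipynb_checkpoints`, so some filtering may be wanted before using it on a real repo.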


---
## Find and Fix Links

In [163]:
def find_relative_links_in_file(file_path, from_path):
    
    relative_links = []
    # capture markdown links [text](link) and quoted links in href/src attributes
    regex = r"(?:\[.*?\]\((.*?)\)|<\w+\s+[^>]*?(?:href|src)=(['\"])(.*?)\2)"
    if file_path.suffix == '.ipynb':
        nb = nbformat.read(file_path, nbformat.NO_CONVERT)
        for cell in nb.cells:
            if cell.cell_type == 'markdown':
                links = re.findall(regex, cell.source)
                for link in links:
                    link = link[0] or link[2]
                    if not link.startswith("http") and not link.startswith('/'):
                        relative_links.append(
                            (file_path, link, from_path)
                            #link_fixer(file_path, link, from_path)
                        )
    elif file_path.suffix == '.md':
        with open(file_path, "r", encoding="utf-8") as f:
            content = f.read()
            links = re.findall(regex, content)
            for link in links:
                link = link[0] or link[2]
                if not link.startswith('http') and not link.startswith('/'):
                    relative_links.append(
                        (file_path, link, from_path)
                        #link_fixer(file_path, link, from_path)
                    )
    
    return relative_links

In [164]:
relative_links = []
for file_path in files:
    file_path = pathlib.Path(file_path)
    relative_links.extend(
        find_relative_links_in_file(file_path, from_path)
    )
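
To see what the pattern in `find_relative_links_in_file` returns, here is a small sketch on a made-up markdown string (the sample text and its file names are hypothetical). `re.findall` yields one 3-tuple per match: the markdown target is in the first group and the `href`/`src` target is in the third, which is why the function takes `link[0] or link[2]`.

```python
import re

# Same pattern as in find_relative_links_in_file
regex = r"(?:\[.*?\]\((.*?)\)|<\w+\s+[^>]*?(?:href|src)=(['\"])(.*?)\2)"

sample = (
    'See [setup](./vertex_search_setup.md) and '
    '<img src="../images/step%201.png" width="50%"> or '
    '[docs](https://example.com/docs).'
)

# Each match is (markdown_target, quote_char, html_target); only one target is non-empty
links = [m[0] or m[2] for m in re.findall(regex, sample)]

# Keep only relative links, using the same filter as the function above
relative = [l for l in links if not l.startswith('http') and not l.startswith('/')]

print(links)
print(relative)
```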

In [166]:
file_path, link, from_path = relative_links[3]
file_path, link, from_path

(PosixPath('/home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search/vertex_search_setup.md'),
 '../../architectures/notebooks/applied/genai/vertex_ai_search/vertex_search_step_1.png',
 '/home/jupyter/vertex-ai-mlops/Applied GenAI/Vertex AI Search')

In [167]:
decoded_link = urllib.parse.unquote(link)
decoded_link

'../../architectures/notebooks/applied/genai/vertex_ai_search/vertex_search_step_1.png'

In [135]:
absolute_path = (file_path.parent / decoded_link).resolve()
absolute_path

PosixPath('/home/jupyter/vertex-ai-mlops/Applied GenAI/architectures/notebooks/applied/genai/vertex_ai_search/vertex_search_step_1.png')

In [136]:
absolute_path.exists()

False

In [137]:
old_absolute_path = (pathlib.Path(from_path) / decoded_link).resolve()
old_absolute_path

PosixPath('/home/jupyter/vertex-ai-mlops/architectures/notebooks/applied/genai/vertex_ai_search/vertex_search_step_1.png')

In [141]:
old_absolute_path.exists()

True

In [139]:
common_folder = pathlib.Path(os.path.commonpath([file_path, old_absolute_path]))
common_folder

PosixPath('/home/jupyter/vertex-ai-mlops')

In [147]:
old_relative_to_common = old_absolute_path.relative_to(common_folder)
old_relative_to_common

PosixPath('architectures/notebooks/applied/genai/vertex_ai_search/vertex_search_step_1.png')

In [149]:
new_relative_to_common = file_path.relative_to(common_folder)
new_relative_to_common

PosixPath('Applied GenAI/legacy/Vertex AI Search/vertex_search_setup.md')

In [152]:
new_relative_parts = list(new_relative_to_common.parts)
old_relative_parts = list(old_relative_to_common.parts)
new_relative_parts, old_relative_parts

(['Applied GenAI', 'legacy', 'Vertex AI Search', 'vertex_search_setup.md'],
 ['architectures',
  'notebooks',
  'applied',
  'genai',
  'vertex_ai_search',
  'vertex_search_step_1.png'])

In [156]:
common_parts_length = len(os.path.commonprefix([new_relative_parts, old_relative_parts]))
common_parts_length

0

In [157]:
new_relative_parts[:common_parts_length] = old_relative_parts[:common_parts_length]
new_relative_parts

['Applied GenAI', 'legacy', 'Vertex AI Search', 'vertex_search_setup.md']

In [158]:
new_absolute_path = (common_folder / pathlib.Path(*new_relative_parts)).resolve()
new_absolute_path

PosixPath('/home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search/vertex_search_setup.md')

In [159]:
new_absolute_path.exists()

True

In [160]:
new_relative_link = new_absolute_path.relative_to(file_path.parent)
new_relative_link

PosixPath('vertex_search_setup.md')

In [142]:
relative_to_common = old_absolute_path.relative_to(common_folder)
relative_to_common

PosixPath('architectures/notebooks/applied/genai/vertex_ai_search/vertex_search_step_1.png')

In [143]:
file_path.parent

PosixPath('/home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search')

In [144]:
new_absolute_path = (file_path.parent / relative_to_common).resolve()
new_absolute_path

PosixPath('/home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search/architectures/notebooks/applied/genai/vertex_ai_search/vertex_search_step_1.png')

In [145]:
new_absolute_path.exists()

False
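
The exploration above suggests a simpler rule. Resolve the link against the folder the file used to live in, then express that target relative to the file's new folder. A minimal sketch of that idea using only path arithmetic; `rebase_link` is a hypothetical helper, not the notebook's final implementation, and it assumes the link target itself did not move:

```python
import posixpath
import urllib.parse

def rebase_link(link, old_dir, new_dir):
    """Rewrite `link`, written relative to `old_dir`, so it is relative to `new_dir`.

    Hypothetical helper: pure path arithmetic, no filesystem access.
    """
    decoded = urllib.parse.unquote(link)
    # Where the link pointed before the move
    target = posixpath.normpath(posixpath.join(old_dir, decoded))
    # The same target, expressed from the file's new folder
    rebased = posixpath.relpath(target, start=new_dir)
    # Re-encode spaces, keeping '/' and any '#anchor' readable
    return urllib.parse.quote(rebased, safe='/#')

old_dir = '/home/jupyter/vertex-ai-mlops/Applied GenAI/Vertex AI Search'
new_dir = '/home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search'
link = '../../architectures/notebooks/applied/genai/vertex_ai_search/vertex_search_step_1.png'

new_link = rebase_link(link, old_dir, new_dir)
print(new_link)
```

This should only be applied to links that no longer resolve from the new location, as with `absolute_path.exists()` returning `False` and `old_absolute_path.exists()` returning `True` above. Links to files that moved together with the file, such as `./vertex_search_setup.md`, still resolve and should be left unchanged.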

In [85]:
for file_path, link, exists in relative_links:
    print(file_path, '\n', link, '\n', exists, '\n\n')

/home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search/Vertex AI Search Python Client Overview.ipynb 
 ./vertex_search_setup.md 
 True 


/home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search/Vertex AI Search Python Client Overview.ipynb 
 ./vertex_search_setup.md 
 True 


/home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search/vertex_search_setup.md 
 ../../architectures/notebooks/applied/genai/vertex_ai_search/vertex_search_step_0.png 
 False 


/home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search/vertex_search_setup.md 
 ../../architectures/notebooks/applied/genai/vertex_ai_search/vertex_search_step_1.png 
 False 


/home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search/vertex_search_setup.md 
 ../../architectures/notebooks/applied/genai/vertex_ai_search/vertex_search_step_2.png 
 False 


/home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search/vertex_search_setup.md 
 ../../architectures/notebooks/appli