<!--- header table --->
<table align="left">
  <td style="text-align: center">
    <a href="https://colab.research.google.com/github/statmike/vertex-ai-mlops/blob/main/architectures/move_notebooks.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/colab-logo-32px.png" alt="Google Colaboratory logo">
      <br>Run in<br>Colab
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/colab/import/https%3A%2F%2Fraw.githubusercontent.com%2Fstatmike%2Fvertex-ai-mlops%2Fmain%2Farchitectures%2Fmove_notebooks.ipynb">
      <img width="32px" src="https://lh3.googleusercontent.com/JmcxdQi-qOpctIvWKgPtrzZdJJK-J3sWE1RsfjZNwshCFgE_9fULcNpuXYTilIR2hjwN" alt="Google Cloud Colab Enterprise logo">
      <br>Run in<br>Colab Enterprise
    </a>
  </td>      
  <td style="text-align: center">
    <a href="https://github.com/statmike/vertex-ai-mlops/blob/main/architectures/move_notebooks.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/github-logo-32px.png" alt="GitHub logo">
      <br>View on<br>GitHub
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/statmike/vertex-ai-mlops/main/architectures/move_notebooks.ipynb">
      <img src="https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32" alt="Vertex AI logo">
      <br>Open in<br>Vertex AI Workbench
    </a>
  </td>
</table>

# Moving Notebook Files

Moves files or folders to a different folder within the repo:
- Preserves each file's commit history by using `git mv`.
- Fixes relative links within markdown cells of notebooks and within markdown files.
- Adds a banner at the top of each file to indicate the file's move and status (not yet implemented; see the task list at the end).

---
## Setup

Installs:

In [21]:
#!pip install GitPython

Imports:

In [1]:
import os, json, urllib.parse, IPython, pathlib, nbformat, re, git

nbformat.NO_CONVERT

nbformat.NO_CONVERT

---
## Locations

`from_path` and `to_path` can be a folder or a specific file.

In [2]:
os.getcwd()

'/home/jupyter/vertex-ai-mlops/Applied GenAI/resources'

In [3]:
repo_path = pathlib.Path('/home/jupyter/vertex-ai-mlops')
from_path = repo_path.joinpath('Applied GenAI/Vertex AI Search')
to_path = repo_path.joinpath('Applied GenAI/legacy/Vertex AI Search')
repo_path, from_path, to_path

(PosixPath('/home/jupyter/vertex-ai-mlops'),
 PosixPath('/home/jupyter/vertex-ai-mlops/Applied GenAI/Vertex AI Search'),
 PosixPath('/home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search'))

---
## Move Folder/File

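This is a git repository, so it is important to move the files with `git mv old_file new_file`. The move is then staged as a rename, and the commit history of each file can still be followed (for example with `git log --follow`).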

In [4]:
repo = git.Repo(repo_path)

In [5]:
to_path.exists()

True

In [6]:
if from_path.exists():
    # git mv stages the move as a rename so the file history can still be followed
    repo.git.mv(from_path, to_path)
    print(f'Files moved from: \n\t{from_path}\nto:\n\t{to_path}')
    repo.index.commit('Moved file')
    print('Moved files committed.')
elif to_path.exists():
    print(f'It appears the file(s) have already moved to:\n\t{to_path}')
else:
    print('Make sure the file(s) exist. Currently not found in the from or to location')

It appears the file(s) have already moved to:
	/home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search


---
## Files List

Create a list of the moved files (.md and .ipynb) that pairs each file's old path with its new full path. If `to_path` is a file, the list contains just that one file. If `to_path` is a folder, the list contains every .md and .ipynb file directly inside that folder; subfolders are not searched.

In [7]:
def file_list(from_path, to_path):
    # returns a list of tuples (from_file_path, to_file_path), one per moved .ipynb or .md file
    files = []
    if to_path.is_dir():
        # notebooks first, then markdown files; only the top level of the folder is searched
        for pattern in ("*.ipynb", "*.md"):
            for new_file in to_path.glob(pattern):
                files.append((from_path.joinpath(new_file.name), new_file))
    elif to_path.is_file() and to_path.suffix in ['.md', '.ipynb']:
        # a single file was moved, so from_path is already the old file path
        files.append((from_path, to_path))
    else:
        print(f'Check for existence of file/folder: {to_path}')

    return files

In [8]:
files = file_list(from_path, to_path)
files

[(PosixPath('/home/jupyter/vertex-ai-mlops/Applied GenAI/Vertex AI Search/Vertex AI Search Python Client Overview.ipynb'),
  PosixPath('/home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search/Vertex AI Search Python Client Overview.ipynb')),
 (PosixPath('/home/jupyter/vertex-ai-mlops/Applied GenAI/Vertex AI Search/vertex_search_setup.md'),
  PosixPath('/home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search/vertex_search_setup.md')),
 (PosixPath('/home/jupyter/vertex-ai-mlops/Applied GenAI/Vertex AI Search/readme.md'),
  PosixPath('/home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search/readme.md'))]
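`file_list` only looks at the top level of a moved folder. If a moved folder also contained subfolders, a recursive variant along the following lines could be used. This is a minimal sketch, not part of the notebook: the name `file_list_recursive`, the `suffixes` parameter, and the demo folder names are all hypothetical.

```python
import pathlib
import tempfile

def file_list_recursive(from_path, to_path, suffixes=(".ipynb", ".md")):
    """Hypothetical recursive variant of file_list: pairs each moved file's
    old path with its new path, including files in subfolders."""
    pairs = []
    for new_file in sorted(to_path.rglob("*")):
        if new_file.is_file() and new_file.suffix in suffixes:
            # Rebuild the old location by re-rooting the path under from_path.
            pairs.append((from_path / new_file.relative_to(to_path), new_file))
    return pairs

# Small demonstration in a throwaway directory (folder names are illustrative only).
with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    old, new = root / "topic", root / "legacy" / "topic"
    (new / "sub").mkdir(parents=True)
    (new / "readme.md").write_text("# readme")
    (new / "sub" / "demo.ipynb").write_text("{}")
    (new / "sub" / "image.png").write_text("")  # ignored: not .md or .ipynb
    demo_pairs = [
        (o.relative_to(root).as_posix(), n.relative_to(root).as_posix())
        for o, n in file_list_recursive(old, new)
    ]

print(demo_pairs)
```

Note that a recursive search would also pick up files in hidden folders such as `.ipynb_checkpoints`, so a real version would likely need to filter those out.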

---
## Find and Fix Links

In [21]:
def link_fixer(from_file_path, to_file_path, link, files):
    # return new_link, a version of link (written for from_file_path) that is updated to work
    # from the new file at to_file_path; prints the reason and returns None if it cannot be fixed
    decoded_link = urllib.parse.unquote(link)
    abs_link_path_from = (from_file_path.parent / decoded_link).resolve()
    also_moved = any(ffp[0] == abs_link_path_from for ffp in files)
    # the linked file must exist, or have been moved along with this file
    if not (abs_link_path_from.exists() or also_moved):
        print(f"Error fixing link: This link is broken before the move:\n\t{abs_link_path_from}")
        return None
    if also_moved:
        # both files moved together, so the relative link stays the same
        new_link = decoded_link
    else:
        # climb from the new folder up to the shared ancestor, then down to the linked file
        common_path_to = pathlib.Path(os.path.commonpath([to_file_path, abs_link_path_from]))
        new_link = (len(to_file_path.parent.parts) - len(common_path_to.parts))*'../' + str(abs_link_path_from.relative_to(common_path_to))
    abs_link_path_to = (to_file_path.parent / new_link).resolve()
    # the new link must resolve to the original target or to an existing file
    if not (abs_link_path_from == abs_link_path_to or abs_link_path_to.exists()):
        print(f"Error fixing link: This function failed to fix the link:\n\t{abs_link_path_from}")
        return None
    return urllib.parse.quote(new_link)
        
def find_relative_links(from_file_path, to_file_path, files):
    # rewrites the relative links in the moved file at to_file_path and returns a list of
    # tuples (from_file_path, to_file_path, link, new_link), one per relative link found
    relative_links = []
    regex = r"(?:\[.*?\]\((.*?)\)|<\w+\s+[^>]*?(?:href|src)=(['\"])(.*?)\2)" # capture markdown links and quoted links in href and src

    def fix_links(text):
        # record every relative link found in text and return text with the links fixed
        replaced = set()  # links already rewritten, so a repeated link is not replaced twice
        for match in re.findall(regex, text):
            link = match[0] or match[2]
            if not link.startswith('http') and not link.startswith('/'):
                new_link = link_fixer(from_file_path, to_file_path, link, files)
                relative_links.append((from_file_path, to_file_path, link, new_link))
                if new_link and new_link != link and link not in replaced:  # the link actually changed
                    text = text.replace(link, new_link)
                    replaced.add(link)
        return text

    if to_file_path.suffix == '.ipynb':
        nb = nbformat.read(to_file_path, nbformat.NO_CONVERT)
        for cell in nb.cells:
            if cell.cell_type == 'markdown':
                cell.source = fix_links(cell.source)
        # Save the modified notebook
        nbformat.write(nb, to_file_path)
    elif to_file_path.suffix == '.md':
        with open(to_file_path, "r") as f:
            content = f.read()
        content = fix_links(content)
        # Save the modified markdown file
        with open(to_file_path, "w") as f:
            f.write(content)

    return relative_links
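To see what the pattern in `find_relative_links` captures, it can be run on a small sample string. The sketch below reuses the same regular expression; the sample text and the image path in it are made up for illustration.

```python
import re
import urllib.parse

# Same pattern as in find_relative_links: group 1 is a markdown link target,
# group 3 is a quoted href/src value (group 2 is the quote character).
LINK_REGEX = r"(?:\[.*?\]\((.*?)\)|<\w+\s+[^>]*?(?:href|src)=(['\"])(.*?)\2)"

# Illustrative sample text, not taken from a real file.
sample = (
    'See [setup](./vertex_search_setup.md), '
    '<img src="../images/step%201.png" width="50%"> '
    'and [docs](https://example.com).'
)

# Each match is a 3-tuple; only one of group 1 / group 3 is non-empty.
targets = [m[0] or m[2] for m in re.findall(LINK_REGEX, sample)]

# Keep only relative links, using the same filter as the notebook.
relative = [t for t in targets if not t.startswith("http") and not t.startswith("/")]

# Links are URL-decoded before being resolved against the file system.
decoded = [urllib.parse.unquote(t) for t in relative]

print(targets)
print(relative)
print(decoded)
```

The filter keeps anything that does not start with `http` or `/`, so in-page anchors such as `#section` would also be passed to `link_fixer` and reported there as broken links.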

In [22]:
relative_links = []
for file in files:
    relative_links.extend(
        find_relative_links(*file, files)
    )
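The core of the link fix is re-expressing the same target relative to the file's new folder. A simpler way to picture that calculation uses `posixpath.relpath` instead of the common-path arithmetic in `link_fixer`. This is a sketch under assumptions: `recompute_link` is a hypothetical helper, `/repo` is a placeholder root, and the original link shown is assumed rather than taken from the actual file. Unlike `link_fixer`, it does not check that the target exists or whether the target also moved.

```python
import posixpath
import urllib.parse

def recompute_link(from_file, to_file, link):
    """Sketch: rewrite a relative link so it points at the same target
    after the containing file moves from from_file to to_file."""
    decoded = urllib.parse.unquote(link)
    # Absolute location the link pointed at before the move.
    target = posixpath.normpath(posixpath.join(posixpath.dirname(from_file), decoded))
    # The same target expressed relative to the file's new folder.
    new_link = posixpath.relpath(target, start=posixpath.dirname(to_file))
    return urllib.parse.quote(new_link)

# Placeholder paths: '/repo' stands in for the repository root.
old_file = "/repo/Applied GenAI/Vertex AI Search/vertex_search_setup.md"
new_file = "/repo/Applied GenAI/legacy/Vertex AI Search/vertex_search_setup.md"
old_link = "../../architectures/notebooks/applied/genai/vertex_ai_search/vertex_search_step_0.png"

print(recompute_link(old_file, new_file, old_link))
```

Moving the file one folder deeper adds one `../` to the link. One difference from the notebook's approach is that `relpath` drops a leading `./`, so a link like `./readme.md` would come back as `readme.md`.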

---
## Changed Files
Detect which files had changes and stage+commit them:

In [33]:
changed_files = list(set([file[1] for file in relative_links if file[3] and file[3] != file[2]]))
changed_files

[PosixPath('/home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search/vertex_search_setup.md')]

In [34]:
for changed_file in changed_files:
    repo.git.add(str(changed_file)) 

In [35]:
repo.index.commit("Fixed relative links after moving this file")

<git.Commit "5906143b98da849bdcb37d4cce1e2e17a8580611">

---
## Add/Edit Banner With Location Change History
Add a section at the top of each .md and .ipynb file that records its location changes with dates. This step is not yet implemented (see the task list at the end).
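One possible shape for this banner, for markdown text, is sketched below. Everything here is an assumption rather than the notebook's design: the function name `add_move_banner`, the HTML comment markers, the banner wording, and the example dates and paths. For notebooks, the same text could presumably be placed in a first markdown cell.

```python
import datetime

# Hypothetical markers that delimit the banner so it can be found again later.
BANNER_START = "<!-- move-history:start -->"
BANNER_END = "<!-- move-history:end -->"

def add_move_banner(content, from_rel, to_rel, moved_on=None):
    """Sketch: insert or extend a move-history banner at the top of markdown text."""
    moved_on = moved_on or datetime.date.today()
    entry = f"> - {moved_on.isoformat()}: moved from `{from_rel}` to `{to_rel}`"
    if content.startswith(BANNER_START) and BANNER_END in content:
        # Banner already present: append the new entry just before the end marker.
        head, tail = content.split(BANNER_END, 1)
        if entry in head:
            return content  # running the step again should not duplicate an entry
        return f"{head}{entry}\n{BANNER_END}{tail}"
    banner = f"{BANNER_START}\n> **This file has moved.**\n{entry}\n{BANNER_END}\n\n"
    return banner + content

# Illustrative usage with made-up paths and dates.
once = add_move_banner("# Title\n", "a/x.md", "b/x.md", datetime.date(2024, 1, 2))
twice = add_move_banner(once, "b/x.md", "c/x.md", datetime.date(2024, 3, 4))
print(twice)
```

Keeping the banner between fixed markers makes the step safe to run more than once, and lets later moves append to the same history instead of stacking banners.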

---
## Some Checks Using `relative_links`

Each entry in `relative_links` is a tuple of `(from_file_path, to_file_path, link, new_link)`:

In [36]:
relative_links[0]

(PosixPath('/home/jupyter/vertex-ai-mlops/Applied GenAI/Vertex AI Search/Vertex AI Search Python Client Overview.ipynb'),
 PosixPath('/home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search/Vertex AI Search Python Client Overview.ipynb'),
 './vertex_search_setup.md',
 './vertex_search_setup.md')

In [37]:
relative_links[-1]

(PosixPath('/home/jupyter/vertex-ai-mlops/Applied GenAI/Vertex AI Search/readme.md'),
 PosixPath('/home/jupyter/vertex-ai-mlops/Applied GenAI/legacy/Vertex AI Search/readme.md'),
 './Vertex%20AI%20Search%20Python%20Client%20Overview.ipynb',
 './Vertex%20AI%20Search%20Python%20Client%20Overview.ipynb')

In [38]:
len(relative_links)

14

In [40]:
for rl in relative_links:
    print(rl[3], rl[3]!=rl[2])

./vertex_search_setup.md False
./vertex_search_setup.md False
../../../architectures/notebooks/applied/genai/vertex_ai_search/vertex_search_step_0.png True
../../../architectures/notebooks/applied/genai/vertex_ai_search/vertex_search_step_1.png True
../../../architectures/notebooks/applied/genai/vertex_ai_search/vertex_search_step_2.png True
../../../architectures/notebooks/applied/genai/vertex_ai_search/vertex_search_step_3.png True
../../../architectures/notebooks/applied/genai/vertex_ai_search/vertex_search_step_4.png True
../../../architectures/notebooks/applied/genai/vertex_ai_search/vertex_search_step_5.png True
../../../architectures/notebooks/applied/genai/vertex_ai_search/vertex_search_step_6.png True
../../../architectures/notebooks/applied/genai/vertex_ai_search/vertex_search_step_7.png True
../../../architectures/notebooks/applied/genai/vertex_ai_search/vertex_search_step_8.png True
../../../architectures/notebooks/applied/genai/vertex_ai_search/vertex_search_step_9.png True

- X handle the relative links to files that were also moved
- X check to see if the link is already updated (in case this was run previously)
- X update the relative link and save the file
- X keep a list of changed files and commit these with a message
- add/update a header cell/section (md) that states the file move history: date, from, to
    - commit these with a message
- remember to update headers
- check all other files for links to moved files
    - update these
    - commit changes
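The last item, checking other files for links to the moved files, could start from a scan like the one below. It is a rough sketch only: `find_inbound_links` is a hypothetical helper, it handles markdown-style links in `.md` files only (not notebooks or `href`/`src` attributes), and it reports the suggested new link without writing anything.

```python
import pathlib
import posixpath
import re
import tempfile
import urllib.parse

MD_LINK = re.compile(r"\[.*?\]\((.*?)\)")  # markdown links only, to keep the sketch short

def find_inbound_links(repo_root, moved):
    """Sketch: list (file, link, new_link) for .md files outside the move
    whose relative links resolve to a moved file's old location."""
    old_to_new = {old.as_posix(): new for old, new in moved}
    moved_new = {new for _, new in moved}
    hits = []
    for md in sorted(repo_root.rglob("*.md")):
        if md in moved_new:
            continue  # the moved files themselves were handled earlier
        for link in MD_LINK.findall(md.read_text()):
            if link.startswith(("http", "/", "#")):
                continue
            # Where the link points, as a normalized absolute path.
            target = posixpath.normpath(
                posixpath.join(md.parent.as_posix(), urllib.parse.unquote(link))
            )
            if target in old_to_new:
                rel = posixpath.relpath(old_to_new[target].as_posix(), md.parent.as_posix())
                hits.append((md, link, urllib.parse.quote(rel)))
    return hits

# Small demonstration in a throwaway directory (names are illustrative only).
with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    old = root / "topic" / "readme.md"
    new = root / "legacy" / "topic" / "readme.md"
    new.parent.mkdir(parents=True)
    new.write_text("# moved file")
    (root / "readme.md").write_text("See [search](topic/readme.md) and [site](https://example.com).")
    demo_hits = [
        (f.relative_to(root).as_posix(), link, new_link)
        for f, link, new_link in find_inbound_links(root, [(old, new)])
    ]

print(demo_hits)
```

The `moved` argument has the same shape as the `files` list built earlier, so that list could be passed in directly along with `repo_path`.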
    
