---
title: Tracking Changes
date: 2023-11-30 
authors:
  - name: Sébastien Boisgérault
    email: Sebastien.Boisgerault@minesparis.psl.eu
    url: https://github.com/boisgera
    affiliations:
      - institution: Mines Paris - PSL University
        department: Institut des Transformation Numériques (ITN)
github: boisgera
license: CC-BY-4.0
open_access: true
---

To understand how `.tldr` files are structured, we can add new graphical objects, change some of their properties, etc., and analyze how the file evolves each time we modify the document.

In this notebook, we develop some tooling to help us track such changes.

In [1]:
# Imports needed for this section

import difflib
import webbrowser
import inspect
import pprint

## Text comparison

We define two similar versions of the "zen of Python":

In [2]:
zen_1 = """The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Errors should never pass silently.
In the face of ambiguity, refuse the temptation to guess.
There should be one obvious way to do it.
Although that way may not be obvious at first.
Now is better than never.
Although never is often better than right now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it is a good idea.
"""

In [3]:
zen_2 = """\
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a very good idea.
Namespaces are one honking great idea -- let's do more of those!
"""

```{exercise}
 1. Transform `zen_1` and `zen_2` into lists of lines.
 2. Use the [`difflib`](https://docs.python.org/3/library/difflib.html) module of the Python standard library to [`compare`](https://docs.python.org/3/library/difflib.html#difflib.Differ.compare) the two sequences.
 3. Join the output of `compare` into a single text and print it.
 4. Interpret the result and list the differences between the two versions of the zen of Python.
```

In [4]:
# Question 1

zen_1_lines = zen_1.splitlines()
zen_2_lines = zen_2.splitlines()

In [5]:
# Question 2

# Create a Differ object to compare the two lists
differ = difflib.Differ()
diff = list(differ.compare(zen_1_lines, zen_2_lines))

diff

['  The Zen of Python, by Tim Peters',
 '  ',
 '  Beautiful is better than ugly.',
 '  Explicit is better than implicit.',
 '  Simple is better than complex.',
 '  Complex is better than complicated.',
 '  Flat is better than nested.',
 '  Sparse is better than dense.',
 '  Readability counts.',
 "  Special cases aren't special enough to break the rules.",
 '+ Although practicality beats purity.',
 '  Errors should never pass silently.',
 '+ Unless explicitly silenced.',
 '  In the face of ambiguity, refuse the temptation to guess.',
 '- There should be one obvious way to do it.',
 '+ There should be one-- and preferably only one --obvious way to do it.',
 '- Although that way may not be obvious at first.',
 "+ Although that way may not be obvious at first unless you're Dutch.",
 '?                                              ++++++++++++++++++++\n',
 '  Now is better than never.',
 '- Although never is often better than right now.',
 '+ Although never is often better than *right* now

In [6]:
# Question 3

# Join the comparison result into a single text
diff_text = '\n'.join(diff)
print(diff_text)

  The Zen of Python, by Tim Peters
  
  Beautiful is better than ugly.
  Explicit is better than implicit.
  Simple is better than complex.
  Complex is better than complicated.
  Flat is better than nested.
  Sparse is better than dense.
  Readability counts.
  Special cases aren't special enough to break the rules.
+ Although practicality beats purity.
  Errors should never pass silently.
+ Unless explicitly silenced.
  In the face of ambiguity, refuse the temptation to guess.
- There should be one obvious way to do it.
+ There should be one-- and preferably only one --obvious way to do it.
- Although that way may not be obvious at first.
+ Although that way may not be obvious at first unless you're Dutch.
?                                              ++++++++++++++++++++

  Now is better than never.
- Although never is often better than right now.
+ Although never is often better than *right* now.
?                                     +     +

  If the implementation is hard to exp
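
The blank lines that follow the `?` hint lines in the printed diff above come from joining with `'\n'`: `Differ` terminates its hint lines with a newline of its own. The `difflib` documentation describes the inputs of `compare` as lines ending with newlines, so one possible variant keeps the line endings and concatenates the result directly. The sketch below uses two short, made-up texts (`old` and `new` are illustrative names, not part of the notebook).

```python
import difflib

old = "one\ntwo\nthree\n"
new = "one\ntwo!\nthree\n"

# keepends=True preserves the trailing "\n" of every line, which is the
# input form described in the documentation of Differ.compare.
old_lines = old.splitlines(keepends=True)
new_lines = new.splitlines(keepends=True)

diff = list(difflib.Differ().compare(old_lines, new_lines))

# Every diff line now ends with a newline, so plain concatenation
# rebuilds the text without empty lines after the "?" hints.
diff_text = "".join(diff)
print(diff_text, end="")
```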

In [7]:
# Question 4

# Interpret the differences between the two versions
for line in diff:
    if line.startswith('- '):
        print(f"Line removed in zen_2: {line[2:]}")
    elif line.startswith('+ '):
        print(f"Line added in zen_2: {line[2:]}")

Line added in zen_2: Although practicality beats purity.
Line added in zen_2: Unless explicitly silenced.
Line removed in zen_2: There should be one obvious way to do it.
Line added in zen_2: There should be one-- and preferably only one --obvious way to do it.
Line removed in zen_2: Although that way may not be obvious at first.
Line added in zen_2: Although that way may not be obvious at first unless you're Dutch.
Line removed in zen_2: Although never is often better than right now.
Line added in zen_2: Although never is often better than *right* now.
Line removed in zen_2: If the implementation is easy to explain, it is a good idea.
Line added in zen_2: If the implementation is easy to explain, it may be a very good idea.
Line added in zen_2: Namespaces are one honking great idea -- let's do more of those!
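
The listing above reports a modified line as a removal followed by an addition. If it helps the interpretation, `difflib.SequenceMatcher.get_opcodes()` can separate pure insertions from replacements. The following is a minimal sketch on two short, made-up lists; `summarize_changes`, `before` and `after` are hypothetical names introduced only for this illustration.

```python
import difflib

def summarize_changes(lines_1, lines_2):
    """Return (tag, old_lines, new_lines) for every non-equal opcode."""
    matcher = difflib.SequenceMatcher(None, lines_1, lines_2)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # tag is one of "equal", "replace", "delete" or "insert"
        if tag != "equal":
            changes.append((tag, lines_1[i1:i2], lines_2[j1:j2]))
    return changes

before = ["a", "b", "c", "d"]
after = ["a", "B", "c", "d", "e"]

for tag, old, new in summarize_changes(before, after):
    print(tag, old, "->", new)
```

The same function could be applied to `zen_1_lines` and `zen_2_lines` to group each rewritten line with its replacement.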


We can make our job easier if we use HTML instead of plain text to visualise the differences between the two texts.


```{exercise}
  1. Use the [HtmlDiff](https://docs.python.org/3/library/difflib.html#difflib.HtmlDiff) class of difflib to produce a `diff.html` file that represents this difference in an HTML document.
  2. Use the [webbrowser](https://docs.python.org/3/library/webbrowser.html) module of the standard library to open it!
  3. Define a `display_diff_text` function that takes two arguments `text_1` and `text_2` and automates steps 1. and 2.
```

In [12]:
def creer_html_diff(texte_1, texte_2, identifiant='', dossier_sortie='docs'):
    
    # Create an HtmlDiff object
    html_diff = difflib.HtmlDiff()

    # Compute the difference between the two texts as a full HTML page
    contenu_diff = html_diff.make_file(texte_1.splitlines(), texte_2.splitlines())

    # Build the file name from the identifier
    nom_fichier = f'diff_{identifiant}.html'

    # Write the difference to the HTML file
    # (the output folder, if any, must already exist)
    chemin_fichier = f'{dossier_sortie}/{nom_fichier}' if dossier_sortie else nom_fichier
    with open(chemin_fichier, 'w', encoding='utf-8') as fichier:
        fichier.write(contenu_diff)

Details on the previous function:
It creates an HTML file that shows the differences between two texts.
    
    Args:
    - texte_1 (str): The first text to compare.
    - texte_2 (str): The second text to compare.
    - identifiant (str): Identifier added to the file name (optional).
    - dossier_sortie (str): Name of the output folder (here `docs`).

In [13]:
# Question 2

def ouvrir_fichier_html_dans_navigateur(fichier_html):
    # Args:
    # - fichier_html (str): Name of the HTML file to open.

    webbrowser.open(fichier_html)

In [14]:
def display_diff_text(text_1, text_2, identifiant='', dossier_sortie='docs'):
    # Compare two texts, write an HTML diff file and open it in the browser.
    
    # Create the HTML diff file
    creer_html_diff(text_1, text_2, identifiant=identifiant, dossier_sortie=dossier_sortie)

    # Rebuild the path of the generated file (same rule as in creer_html_diff)
    nom_fichier = f'diff_{identifiant}.html'
    chemin_fichier = f'{dossier_sortie}/{nom_fichier}' if dossier_sortie else nom_fichier

    # Open the HTML file in the web browser
    ouvrir_fichier_html_dans_navigateur(chemin_fichier)

display_diff_text(zen_1, zen_2, identifiant='zen12')
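
The `webbrowser` documentation notes that opening a bare file name is neither supported nor portable across platforms. A possible variant, sketched below, writes the diff with `pathlib` and hands the browser an absolute `file://` URI. The names `write_diff_html`, `show_diff` and `open_browser` are hypothetical and not part of the notebook's solution.

```python
import difflib
import webbrowser
from pathlib import Path

def write_diff_html(text_1, text_2, path):
    """Write an HTML diff of two texts to `path` and return its file URI."""
    html = difflib.HtmlDiff().make_file(
        text_1.splitlines(), text_2.splitlines()
    )
    path = Path(path)
    # Create the parent folder if needed so that the write cannot fail
    # on a missing directory.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(html, encoding="utf-8")
    # An absolute file:// URI does not depend on the working directory.
    return path.resolve().as_uri()

def show_diff(text_1, text_2, path="docs/diff.html", open_browser=True):
    """Write the diff, optionally open it in the browser, return the URI."""
    uri = write_diff_html(text_1, text_2, path)
    if open_browser:
        webbrowser.open(uri)
    return uri
```

Calling `show_diff(zen_1, zen_2)` would then play the role of `display_diff_text`; passing `open_browser=False` only writes the file, which is convenient in automated runs.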

## Comparison of JSON documents

````{exercise} Comparison of dictionaries

 1. Create a `display_diff` function that takes two Python objects, converts them to strings then leverages `display_diff_text` to display the difference in a browser.

 2. Consider the 3 dictionaries defined by
    ```python
    d1 = {k:k+1 for k in range(100)}
    d2 = d1.copy(); d2[50] = 50
    d3 = {k:k+1 for k in range(99, -1, -1)}
    ```
    `d1` and `d2` differ slightly, whereas `d1` and `d3` are equal.
    Does your `display_diff` function make it easy to spot where the difference is when it compares `d1` and `d2`?
    Does it make it easy to see that `d1` and `d3` are equal?

  3. Investigate the [`pprint`](https://docs.python.org/3/library/pprint.html) module of the standard library; use it to improve the behavior of `display_diff` in the two cases considered in the previous question.

````
 

In [15]:
# Question 1

def display_diff(obj_1, obj_2):
    # Display the difference between two Python objects by converting them to strings.
    
    # Convert the objects to strings
    str_1 = str(obj_1)
    str_2 = str(obj_2)

    # Look up the names of the variables passed as arguments in the caller's scope
    frame = inspect.currentframe()
    try:
        caller_locals = frame.f_back.f_locals
        variable_names = [name for name, value in caller_locals.items() if value is obj_1 or value is obj_2]
    finally:
        del frame

    # Use display_diff_text to show the difference, with the variable names as identifier
    a = ''.join(variable_names)
    display_diff_text(str_1, str_2, identifiant=a)

In [16]:
# Question 2

def compare_and_display(d1, d2, d3):
    # Compare three dictionaries and display the differences in a web browser.

    # Compare dictionaries d1 and d2
    display_diff(d1, d2)

    # Compare dictionaries d1 and d3
    display_diff(d1, d3)

# Define the dictionaries
d1 = {k: k + 1 for k in range(100)}
d2 = d1.copy()
d2[50] = 50
d3 = {k: k + 1 for k in range(99, -1, -1)}

# Compare and display the differences
compare_and_display(d1, d2, d3)

The differences are very hard to spot: there is red and green everywhere, and the dictionaries are long and displayed side by side.

In [17]:
# Question 3

def display_diff_pprint(obj_1, obj_2):
    # Display the difference between two Python objects by converting them to strings with pprint.

    # Convert the objects to strings with pprint for better readability
    str_1 = pprint.pformat(obj_1)
    str_2 = pprint.pformat(obj_2)

    # Look up the names of the variables passed as arguments in the caller's scope
    frame = inspect.currentframe()
    try:
        caller_locals = frame.f_back.f_locals
        variable_names = [name for name, value in caller_locals.items() if value is obj_1 or value is obj_2]
    finally:
        del frame

    # Build the identifier from the variable names
    a = ''.join(variable_names) + 'dic'

    # Use display_diff_text to show the difference between the pretty-printed texts
    display_diff_text(str_1, str_2, identifiant=a)

# Example with the dictionaries defined above
display_diff_pprint(d1, d2)
display_diff_pprint(d1, d3)
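
The effect of `pprint` on these two cases can also be checked without a browser. The short sketch below rebuilds the three dictionaries of the exercise and relies on two documented behaviors of `pprint.pformat`: dictionary keys are sorted by default (`sort_dicts=True`), and a dictionary too long for one line is printed with one entry per line.

```python
import pprint

d1 = {k: k + 1 for k in range(100)}
d2 = d1.copy()
d2[50] = 50
d3 = {k: k + 1 for k in range(99, -1, -1)}

# str() follows insertion order: equal dictionaries can print differently.
assert d1 == d3
assert str(d1) != str(d3)

# pformat sorts the keys, so equal dictionaries give identical text.
assert pprint.pformat(d1) == pprint.pformat(d3)

# With one entry per line, the single change between d1 and d2
# is confined to a single line of the pretty-printed text.
lines_1 = pprint.pformat(d1).splitlines()
lines_2 = pprint.pformat(d2).splitlines()
changed = [(a, b) for a, b in zip(lines_1, lines_2) if a != b]
print(changed)
```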

```{exercise} tldraw documents comparator
Implement a function `tldraw_diff` that takes two filenames referring to tldraw documents as arguments and displays their differences in the browser.
```

In [18]:
import json

def tldraw_diff(tldraw_1, tldraw_2):
    # Display the difference between two tldraw JSON files in a web browser, using pprint.
    
    # Load the JSON documents
    with open(tldraw_1, 'r', encoding='utf-8') as file:
        tldraw_1 = json.load(file)

    with open(tldraw_2, 'r', encoding='utf-8') as file:
        tldraw_2 = json.load(file)

    # Use display_diff_pprint to show the difference between the pretty-printed documents
    display_diff_pprint(tldraw_1, tldraw_2)
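
As a complement, the comparison can be done in plain text by serializing both documents with `json.dumps` using `indent` and `sort_keys`, then passing the lines to `difflib.unified_diff`. This is only a sketch: it assumes, as the function above does, that the files are JSON, and the name `json_diff_lines` is hypothetical.

```python
import difflib
import json

def json_diff_lines(path_1, path_2):
    """Return the unified-diff lines between two JSON files."""
    texts = []
    for path in (path_1, path_2):
        with open(path, "r", encoding="utf-8") as file:
            document = json.load(file)
        # indent and sort_keys give a stable layout, one value per line,
        # that does not depend on the key order in the file.
        texts.append(
            json.dumps(document, indent=2, sort_keys=True, ensure_ascii=False)
        )
    lines_1, lines_2 = (text.splitlines() for text in texts)
    return list(
        difflib.unified_diff(
            lines_1, lines_2,
            fromfile=str(path_1), tofile=str(path_2), lineterm="",
        )
    )
```

Printing `"\n".join(json_diff_lines(file_1, file_2))` gives a compact report in which only the changed values and a few lines of context appear.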