The training project "Difference Generator" on the Python Development course on Hexlet.io

IgorGakhov/Difference-Generator-py

Difference Generator


The training project "Difference Generator" on the Python Development course on Hexlet.io.

Built With

Languages, frameworks and libraries used in the implementation of the project:

Dependencies

List of dependencies, without which the project code will not work correctly:

  • python = "^3.8"
  • pyyaml = "^6.0"

Description

Difference Generator is a program that determines the difference between two data structures. This is a popular task for which there are many online services, for example: http://www.jsondiff.com/. A similar mechanism is used when displaying test results or when automatically tracking changes in configuration files.

The central question of the project is how to describe the internal representation of the diff between the files so that it is as convenient as possible to work with. There are many ways to do this, but only a few of them lead to simple code.

Working with trees and tree recursion is a good way to build algorithmic thinking. This matters because real-world development involves constant data processing, various transformations, and output of collections.

To build a diff between two structures, many operations have to be done: reading files, parsing incoming data, building a tree of differences, and generating the necessary output.
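
The reading and parsing steps can be sketched roughly as follows. This is a minimal illustration, not the project's actual loader: the function names `parse` and `load` are hypothetical, and choosing the parser by file extension is an assumption.

```python
import json
from pathlib import Path


def parse(content: str, data_format: str) -> dict:
    """Parse raw text into a dict according to its format (hypothetical helper)."""
    if data_format == "json":
        return json.loads(content)
    if data_format in ("yaml", "yml"):
        import yaml  # PyYAML; imported lazily so JSON works without it
        return yaml.safe_load(content)
    raise ValueError(f"Unsupported format: {data_format}")


def load(file_path: str) -> dict:
    """Read a file and parse it based on its extension (an assumption)."""
    path = Path(file_path)
    data_format = path.suffix.lstrip(".").lower()
    return parse(path.read_text(encoding="utf-8"), data_format)


print(parse('{"host": "hexlet.io", "timeout": 50}', "json"))
```

Keeping parsing separate from file reading makes the parser easy to test with plain strings.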

Utility features:

  • Supported file formats: YAML, JSON.
  • Report generation as plain text, structured text or JSON.
  • Can be used as a CLI tool or as an external library.

Summary


Installation

Python

Before installing the package, you need to make sure that you have Python version 3.8 or higher installed:

# Windows, Ubuntu, MacOS:
>> python --version # or python -V
Python 3.8.0+

⚠️ If a command without a version does not work, specify the Python version explicitly: python3 --version.
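
The same check can also be done from inside Python. The snippet below is a small sketch; the helper name `is_supported` is hypothetical, and the minimum version is taken from the dependency list above.

```python
import sys

MIN_VERSION = (3, 8)  # from the dependency list: python = "^3.8"


def is_supported(version_info=sys.version_info) -> bool:
    """Return True if the interpreter meets the minimum version."""
    return tuple(version_info[:2]) >= MIN_VERSION


if __name__ == "__main__":
    print("OK" if is_supported() else "Python 3.8 or higher is required")
```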

If you have an older version installed, update with the following commands:

# Windows:
# pip cannot upgrade the Python interpreter itself;
# download and run the latest installer from the official Python website.

# Ubuntu:
>> sudo apt-get upgrade python3.X

# MacOS:
>> brew update && brew upgrade python

# * X - version number to be installed

If you don't have Python installed, you can download and install it from the official Python website. If you are an Ubuntu or MacOS user, then it is better to do this procedure through package managers. Open a terminal and run the command for your operating system:

# Ubuntu:
>> sudo apt update
>> sudo apt install python3.X

# MacOS:
# https://brew.sh/index_ru.html
>> brew install python@3.X

# * X - version number to be installed

❗ Operating systems and their versions differ so much in configuration that a single universal instruction is impossible. If you are running an OS other than those listed above, or the suggested commands produce errors, search Stack Overflow: someone has probably run into the same problem before you. Setting up the environment is not easy! 🙂

Poetry

The project uses the Poetry manager. Poetry is a tool for dependency management and packaging in Python. It allows you to declare the libraries your project depends on and it will manage (install/update) them for you. You can read more about this tool on the official Poetry website.

Poetry provides a custom installer that will install poetry isolated from the rest of your system by vendorizing its dependencies. This is the recommended way of installing poetry.

# Windows (WSL), Linux, MacOS:
>> curl -sSL https://install.python-poetry.org | python3 -

# Windows (Powershell):
>> (Invoke-WebRequest -Uri https://install.python-poetry.org -UseBasicParsing).Content | py -
# If you have installed Python through the Microsoft Store, replace "py" with "python" in the command above.

⚠️ On some systems, python may still refer to Python 2 instead of Python 3. The Poetry team suggests using the python3 binary to avoid ambiguity.

⚠️ By default, Poetry is installed into a platform and user-specific directory:

  • ~/Library/Application Support/pypoetry on MacOS.
  • ~/.local/share/pypoetry on Linux/Unix.
  • %APPDATA%\pypoetry on Windows.

If you wish to change this, you may define the $POETRY_HOME environment variable:

>> curl -sSL https://install.python-poetry.org | POETRY_HOME=/etc/poetry python3 -

Add Poetry to your PATH.

Once Poetry is installed and in your $PATH, you can execute the following:

>> poetry --version

Project package

To work with the package, you need to clone the repository to your computer. This is done using the git clone command. Clone the project on the command line:

# clone via HTTPS:
>> git clone https://github.com/IgorGakhov/python-project-lvl2.git

# clone via SSH:
>> git clone git@github.com:IgorGakhov/python-project-lvl2.git

All that remains is to change into the project directory and install the package:

>> cd python-project-lvl2
>> poetry build
>> python3 -m pip install --user dist/*.whl
# If you have previously installed a package and want to update it, use the following command:
# >> python3 -m pip install --user --force-reinstall dist/*.whl

Finally, we can move on to using the project functionality!


Usage

As external library

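from gendiff import generate_diff
# Paths to two JSON or YAML files (absolute or relative); example names only.
file_path1, file_path2 = 'filepath1.json', 'filepath2.json'
diff = generate_diff(file_path1, file_path2)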

As CLI tool

Help

If you are unsure how to use the utility, call the built-in help:

>> gendiff --help
usage: gendiff [-h] [-f {stylish,json,plain}] first_file second_file

Compares two configuration files and shows a difference.

positional arguments:
  first_file
  second_file

options:
  -h, --help            show this help message and exit
  -f {stylish,json,plain}, --format {stylish,json,plain}
                        set format of output (default: stylish)
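
An interface like this can be declared with the standard argparse module. The sketch below reproduces the arguments shown in the help output; it assumes the project uses argparse, and the function name `build_parser` is hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser matching the help output above (a sketch)."""
    parser = argparse.ArgumentParser(
        prog="gendiff",
        description="Compares two configuration files and shows a difference.",
    )
    parser.add_argument("first_file")
    parser.add_argument("second_file")
    parser.add_argument(
        "-f", "--format",
        choices=["stylish", "json", "plain"],
        default="stylish",
        help="set format of output (default: stylish)",
    )
    return parser


args = build_parser().parse_args(["filepath1.json", "filepath2.json", "-f", "plain"])
print(args.first_file, args.second_file, args.format)
```

Restricting the option with `choices` lets argparse reject unknown formats before any file is read.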

Demo

⚡ Both absolute and relative paths to files are supported.

📌 Stylish format

If the format option is omitted, the output is a string in the stylish format by default.

The diff is built based on how the files have changed relative to each other. Keys are displayed in alphabetical order.

The absence of a plus or minus indicates that the key is in both files, and its values are the same. In all other situations, the key value is either different, or the key is in only one file.
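
For flat data, these rules can be sketched as follows. This is an illustration only: it assumes diff nodes shaped like the JSON output format of this utility (a `value` and a `node type` per key), and the names `to_str` and `stylish_flat` are hypothetical.

```python
import json


def to_str(value) -> str:
    """Render scalars JSON-style (true/false/null); strings stay unquoted."""
    return value if isinstance(value, str) else json.dumps(value)


def stylish_flat(diff: dict) -> str:
    """Render a flat diff; nested values are out of scope for this sketch."""
    lines = []
    for key in sorted(diff):
        node_type, value = diff[key]["node type"], diff[key]["value"]
        if node_type == "REMOVED":
            lines.append(f"  - {key}: {to_str(value)}")
        elif node_type == "ADDED":
            lines.append(f"  + {key}: {to_str(value)}")
        elif node_type == "UPDATED":
            lines.append(f"  - {key}: {to_str(value['old'])}")
            lines.append(f"  + {key}: {to_str(value['new'])}")
        else:  # UNCHANGED: no sign, the key is the same in both files
            lines.append(f"    {key}: {to_str(value)}")
    return "\n".join(["{", *lines, "}"])


diff = {
    "follow": {"value": False, "node type": "REMOVED"},
    "host": {"value": "hexlet.io", "node type": "UNCHANGED"},
    "timeout": {"value": {"old": 50, "new": 20}, "node type": "UPDATED"},
    "verbose": {"value": True, "node type": "ADDED"},
}
print(stylish_flat(diff))
```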

Example:

>> gendiff filepath1.json filepath2.json
{
  - follow: false
    host: hexlet.io
  - proxy: 123.234.53.22
  - timeout: 50
  + timeout: 20
  + verbose: true
}

Demo recordings (asciicast):

  • Compare two flat JSON and/or YAML files: stylish format
  • Compare two nested JSON and/or YAML files: stylish format

📌 Plain format

The text describes the changes as if the second object had been merged into the first.

  • If the new property value is complex, then [complex value] is written.
  • If the property is nested, the entire path from the root is displayed, not just the immediate parent.
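
The two rules above can be sketched with a recursive formatter. Several details here are assumptions rather than facts about the project: nodes are shaped like the utility's JSON output, a NESTED node stores its children under `value`, path segments are joined with dots, and strings are wrapped in single quotes. The function names are hypothetical.

```python
import json


def to_str(value) -> str:
    """Format a value for the plain report."""
    if isinstance(value, dict):
        return "[complex value]"
    if isinstance(value, str):
        return f"'{value}'"  # quoting strings is an assumption
    return json.dumps(value)  # true / false / null / numbers


def plain(diff: dict, parent: str = "") -> str:
    """Describe changes line by line, using the full path to each property."""
    lines = []
    for key in sorted(diff):
        node_type, value = diff[key]["node type"], diff[key]["value"]
        path = f"{parent}.{key}" if parent else key
        if node_type == "NESTED":
            lines.append(plain(value, path))
        elif node_type == "REMOVED":
            lines.append(f"Property '{path}' was removed")
        elif node_type == "ADDED":
            lines.append(
                f"Property '{path}' was added with value: {to_str(value)}"
            )
        elif node_type == "UPDATED":
            lines.append(
                f"Property '{path}' was updated. "
                f"From {to_str(value['old'])} to {to_str(value['new'])}"
            )
    # Unchanged keys produce no line; drop empty results of nested calls.
    return "\n".join(line for line in lines if line)


diff = {
    "common": {
        "node type": "NESTED",
        "value": {
            "setting": {"value": {"key": "value"}, "node type": "ADDED"},
            "timeout": {"value": {"old": 50, "new": 20}, "node type": "UPDATED"},
        },
    },
    "follow": {"value": False, "node type": "REMOVED"},
}
print(plain(diff))
```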

Example:

>> gendiff --format plain filepath1.json filepath2.json
Property 'follow' was removed
Property 'proxy' was removed
Property 'timeout' was updated. From 50 to 20
Property 'verbose' was added with value: true

Demo recordings (asciicast):

  • Compare two flat JSON and/or YAML files: plain format
  • Compare two nested JSON and/or YAML files: plain format

📌 JSON format

JSON (JavaScript Object Notation) is a standard text format for representing structured data based on JavaScript object syntax. It is usually used to transfer data in web applications (e.g. sending some data from the server to the client so that it can be displayed on a web page or vice versa).

Example:

>> gendiff --format json filepath1.json filepath2.json
{
    "follow": {
        "value": false,
        "node type": "REMOVED"
    },
    "host": {
        "value": "hexlet.io",
        "node type": "UNCHANGED"
    },
    "proxy": {
        "value": "123.234.53.22",
        "node type": "REMOVED"
    },
    "timeout": {
        "value": {
            "old": 50,
            "new": 20
        },
        "node type": "UPDATED"
    },
    "verbose": {
        "value": true,
        "node type": "ADDED"
    }
}

Node types:

  • "ADDED": key was not present in the first file, but was present in the second file.
  • "REMOVED": key was present in the first file, but not present in the second file.
  • "UNCHANGED": key exists in both files and its values match.
  • "UPDATED": key exists in both files, but its values do not match.
  • "NESTED": similar to 'updated', but here the values are dictionaries.
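
The node types listed above suggest a recursive comparison of two dictionaries. The sketch below is one possible way to build such a tree, not necessarily the project's implementation: the function name `build_diff` is hypothetical, and storing the children of a NESTED node under `value` is an assumption.

```python
import json


def build_diff(first: dict, second: dict) -> dict:
    """Compare two dicts key by key and label each key with a node type."""
    diff = {}
    for key in sorted(first.keys() | second.keys()):
        if key not in first:
            diff[key] = {"value": second[key], "node type": "ADDED"}
        elif key not in second:
            diff[key] = {"value": first[key], "node type": "REMOVED"}
        elif isinstance(first[key], dict) and isinstance(second[key], dict):
            children = build_diff(first[key], second[key])
            diff[key] = {"value": children, "node type": "NESTED"}
        elif first[key] == second[key]:
            diff[key] = {"value": first[key], "node type": "UNCHANGED"}
        else:
            changed = {"old": first[key], "new": second[key]}
            diff[key] = {"value": changed, "node type": "UPDATED"}
    return diff


first = {
    "follow": False,
    "host": "hexlet.io",
    "proxy": "123.234.53.22",
    "timeout": 50,
}
second = {"host": "hexlet.io", "timeout": 20, "verbose": True}
print(json.dumps(build_diff(first, second), indent=4))
```

With these inputs the result has the same keys and node types as the JSON example above.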

Demo recordings (asciicast):

  • Compare two flat JSON and/or YAML files: JSON format
  • Compare two nested JSON and/or YAML files: JSON format


Development

Dev Dependencies

List of dev-dependencies:

  • flake8 = "^4.0.1"
  • pytest = "^7.1.2"
  • pytest-cov = "^3.0.0"

Project Organization

.
├── gendiff
│   ├── __init__.py
│   ├── cli.py
│   ├── file_processor
│   │   ├── __init__.py
│   │   ├── gendiff.py
│   │   ├── file_handler.py
│   │   ├── data_loader.py
│   │   └── diff_assembler.py
│   ├── formatters
│   │   ├── __init__.py
│   │   ├── tree_render.py
│   │   ├── stylish.py
│   │   ├── plain.py
│   │   └── json.py
│   └── scripts
│       ├── __init__.py
│       └── run.py
├── tests
│   ├── fixtures
│   │   ├── diff_requests
│   │   └── diff_responses
│   ├── test_cli.py
│   └── test_gendiff.py
├── Makefile
├── pyproject.toml
├── README.md
└── setup.cfg

Useful commands

The commands most used in development are listed in the Makefile:

make package-install
Installs the package in the user environment.
make build
Builds the distribution of the Poetry package.
make package-force-reinstall
Reinstalls the package in the user environment.
make lint
Checks the code with the linter.
make test
Runs the tests.
make fast-check
Builds the distribution, reinstalls it in the user environment, and checks the code with tests and the linter.

Thank you for your attention!

👨‍💻 Author: @IgorGakhov