Dependency management #14

Merged 43 commits on Jul 26, 2022

@mweidling mweidling commented Jul 1, 2022

Description

This PR adds two new attributes to the Repo class: dependencies and dependency_conflicts. In repos.json they have the following structure (the example shows the output for cor-asv-ann):

"dependencies": {
    "Keras": "2.3.1",
    "Keras-Applications": "1.0.8",
    "Keras-Preprocessing": "1.1.2",
    "Markdown": "3.3.7",
    "absl-py": "1.1.0",
    "astor": "0.8.1",
    "cycler": "0.11.0",
    "editdistance": "0.6.0",
    "fonttools": "4.33.3",
    "gast": "0.2.2",
    "google-pasta": "0.2.0",
    "grpcio": "1.47.0",
    "h5py": "2.10.0",
    "kiwisolver": "1.4.3",
    "matplotlib": "3.5.2",
    "numpy": "1.18.5",
    "ocrd-cor-asv-ann": "ocrd-cor-asv-ann",
    "opt-einsum": "3.3.0",
    "packaging": "21.3",
    "protobuf": "4.21.2",
    "pyparsing": "3.0.9",
    "python-dateutil": "2.8.2",
    "scipy": "1.7.3",
    "six": "1.16.0",
    "tensorboard": "1.15.0",
    "tensorflow-estimator": "1.15.1",
    "tensorflow-gpu": "1.15.5",
    "termcolor": "1.1.0"
},
"dependency_conflicts": {
    "absl-py": {
        "cor-asv-ann": "1.1.0",
        "eynollah": "1.1.0",
        "ocrd_anybaseocr": "1.1.0",
        "ocrd_calamari": "1.1.0",
        "ocrd_keraslm": "1.1.0",
        "ocrd_kraken": "1.1.0",
        "ocrd_pc_segmentation": "0.15.0",
        "sbb_binarization": "1.1.0"
    },
    "h5py": {
        "cor-asv-ann": "2.10.0",
        "eynollah": "3.7.0",
        "ocrd_anybaseocr": "3.7.0",
        "ocrd_calamari": "3.7.0",
        "ocrd_keraslm": "2.10.0",
        "ocrd_pc_segmentation": "3.1.0",
        "sbb_binarization": "3.7.0"
    },
    "protobuf": {
        "cor-asv-ann": "4.21.2",
        "eynollah": "3.19.4",
        "ocrd_anybaseocr": "3.19.4",
        "ocrd_calamari": "3.19.4",
        "ocrd_keraslm": "4.21.2",
        "ocrd_kraken": "3.19.4",
        "ocrd_pc_segmentation": "3.19.4",
        "sbb_binarization": "3.19.4"
    },
    "tensorboard": {
        "cor-asv-ann": "1.15.0",
        "eynollah": "2.9.1",
        "ocrd_anybaseocr": "2.9.1",
        "ocrd_calamari": "2.9.1",
        "ocrd_keraslm": "1.15.0",
        "ocrd_kraken": "2.9.1",
        "ocrd_pc_segmentation": "2.9.1",
        "sbb_binarization": "2.9.1"
    },
    "tensorflow-estimator": {
        "cor-asv-ann": "1.15.1",
        "eynollah": "2.9.0",
        "ocrd_anybaseocr": "2.9.0",
        "ocrd_calamari": "2.9.0",
        "ocrd_keraslm": "1.15.1",
        "ocrd_pc_segmentation": "2.5.0",
        "sbb_binarization": "2.9.0"
    }
}

The output shown above is generated by:

  • creating a venv for every submodule in ocrd_all (except opencv-python and tesseract, since they aren't really part of OCR-D) and installing the submodule into it
  • retrieving the installed package versions with pip freeze -l
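
As a rough illustration of the second step, the text printed by pip freeze -l could be turned into a name-to-version mapping roughly as follows. This is a minimal sketch, not the code in dependencies.sh: the function name parse_freeze is hypothetical, and skipping every line that is not a plain name==version pin (for example editable installs) is an assumption.

```python
def parse_freeze(freeze_output):
    """Map package name -> pinned version from `pip freeze` text.

    Assumption: only plain `name==version` lines are of interest;
    editable installs (`-e ...`) and other line formats are skipped.
    """
    deps = {}
    for line in freeze_output.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or line.startswith("-e"):
            continue
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        deps[name.strip()] = version.strip()
    return deps
```

In practice the input would be the captured output of pip freeze -l, run inside the venv of the respective submodule.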

As output we create two files: deps.json and dep_conflicts.json. The former lists all dependencies per OCR-D project, while the latter shows which packages have been installed by several OCR-D projects in different versions. In all cases the dependencies of OCR-D/core are omitted because we assume that most Python-based OCR-D projects depend on it. Both files are auxiliary files used by the Repo class and will be updated on demand (TODO).
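
Omitting the OCR-D/core dependencies could look roughly like the sketch below. The helper name drop_core_deps is hypothetical, and matching by package name only (regardless of version) is an assumption; the PR text does not say how the filter is implemented.

```python
import json


def drop_core_deps(project_deps, core_deps):
    """Remove every package that OCR-D/core already depends on.

    project_deps: {project: {package: version}}
    core_deps:    {package: version} as installed for OCR-D/core
    Assumption: packages are matched by name only.
    """
    return {
        project: {pkg: ver for pkg, ver in deps.items() if pkg not in core_deps}
        for project, deps in project_deps.items()
    }


def to_json(data):
    """Serialize with stable key order, e.g. for writing deps.json."""
    return json.dumps(data, indent=4, sort_keys=True)
```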

Repo.dependencies shows a full list of all dependencies. There is no use case for this information yet, so we might decide to toss it.
Repo.dependency_conflicts lists the packages that different projects have installed in different major versions. We rely on packages implementing semantic versioning correctly and assume that different major versions mean there are breaking changes between them. Cases where two or more projects have installed the same package in different minor or patch versions are ignored.
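
The major-version rule described above can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: the names major and find_conflicts are hypothetical, and the major version is taken to be the leading numeric component before the first dot.

```python
from collections import defaultdict


def major(version):
    """Return the leading numeric component of a version string, or None."""
    head = version.split(".", 1)[0]
    return int(head) if head.isdigit() else None


def find_conflicts(project_deps):
    """Return {package: {project: version}} for packages whose installed
    major versions differ between projects. Differences in minor or patch
    versions only are ignored."""
    by_package = defaultdict(dict)
    for project, deps in project_deps.items():
        for package, version in deps.items():
            by_package[package][project] = version

    conflicts = {}
    for package, versions in by_package.items():
        majors = {major(v) for v in versions.values()}
        majors.discard(None)  # ignore versions that cannot be parsed
        if len(majors) > 1:
            conflicts[package] = versions
    return conflicts
```

With the h5py versions from the example above (2.10.0 for cor-asv-ann, 3.7.0 for eynollah), this sketch reports h5py as a conflict, while a package that differs only in its minor version is left out.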

How to test it

  • check out branch dependency-management
  • run make repos.json
  • wait for it
  • see resulting repos.json

Closes #11.

@mweidling mweidling requested review from kba and paulpestov July 1, 2022 13:58
@mweidling mweidling self-assigned this Jul 1, 2022
@mweidling
Contributor Author

@paulpestov Could you please give me feedback regarding the data structure?

@paulpestov

I think the structure is alright. Maybe one thing, could we switch the key/value of version/project, so the frontend can recognize the affected projects easier:

{
 "tensorflow": "1.0.0"
}

@mweidling
Contributor Author

> I think the structure is alright. Maybe one thing, could we switch the key/value of version/project, so the frontend can recognize the affected projects easier:
>
> {
>  "tensorflow": "1.0.0"
> }

You mean within dependency_conflicts? Sure!
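
The thread does not show the final structure. One possible reading of the request is to group the affected projects under each version instead of listing a version per project. A minimal sketch of that inversion, with the hypothetical helper name group_projects_by_version:

```python
def group_projects_by_version(project_versions):
    """Invert {project: version} into {version: [projects]}.

    Assumption: this is only one interpretation of "switching key and
    value"; the structure actually merged may differ.
    """
    grouped = {}
    for project, version in project_versions.items():
        grouped.setdefault(version, []).append(project)
    return {version: sorted(projects) for version, projects in grouped.items()}
```

Applied to a single entry of dependency_conflicts, this lets a frontend look up all projects affected by a given version directly.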

@mweidling mweidling merged commit 7471d30 into OCR-D:main Jul 26, 2022
@mweidling mweidling deleted the dependency-management branch July 26, 2022 06:48
Linked issue: Projects tabs: concept for dependencies