CI: Transition from s4ext to json #2011

Merged
jcfr merged 7 commits into Slicer:main from the transition-from-s4ext-to-json branch on Apr 24, 2024

Member

@jcfr jcfr commented Mar 12, 2024

In response to discussions during the Slicer developer hangout on March 12th, we have decided to revamp the organization of extension metadata in the extensions index, transitioning from the s4ext format to json.

Motivation:

  • Eliminate redundant and unused information from the "description" file.
  • Simplify programmatic parsing of the "description" files.
  • Decouple the metadata set in each extension's CMakeLists.txt from the metadata stored in this repository and used to drive the build of extensions.
  • Enable Slicer maintainers to define and update the extension category independently [1] of the upstream extension sources.
  • Prepare for the integration of the upcoming "tier" metadata. [2]

Related issues:

Related pull requests:

Next steps:

cc: @lassoan @sjh26 @RafaelPalomar @mauigna06 @pieper

Footnotes

  1. The EXTENSION_CATEGORY variable currently set in the CMakeLists.txt will likely be deprecated.

  2. This is described on the Slicer Discourse forum: https://discourse.slicer.org/t/introduction-of-tiers-for-slicer-extensions/34870

@jcfr jcfr force-pushed the transition-from-s4ext-to-json branch 2 times, most recently from 930b93b to edfaab2, on March 12, 2024 20:32
@jcfr
Member Author

jcfr commented Mar 12, 2024

For reference, the following script was used to convert from s4ext to json:

```
import json
import sys
from pathlib import Path

extensions_index_dir = Path("/home/jcfr/Projects/ExtensionsIndex")
updated_extensions_index_dir = extensions_index_dir


def parse_s4ext(ext_file_path):
    """Parse a Slicer extension description file.
    :param ext_file_path: Path to a Slicer extension description file.
    """
    ext_metadata = {}
    with open(ext_file_path) as ext_file:
        for line in ext_file:
            if not line.strip() or line.startswith("#"):
                continue
            fields = [field.strip() for field in line.split(' ', 1)]
            assert(len(fields) <= 2)
            ext_metadata[fields[0]] = fields[1] if len(fields) == 2 else None
    return ext_metadata


# Collect s4ext files
s4ext_filepaths = list(extensions_index_dir.glob("*.s4ext"))

print(f"Found {len(s4ext_filepaths)} extension files (.s4ext)")

# Parse s4ext files and generate corresponding json files
for index, filepath in enumerate(s4ext_filepaths):

    metadata = parse_s4ext(filepath)
    #print("filepath", filepath)
    updated_metadata = {
        "scmurl": metadata["scmurl"],
        "scmrevision": metadata["scmrevision"],
        "build_dependencies": [] if metadata.get("depends", "NA") == "NA" else metadata["depends"].split(" "),
        "category": metadata["category"],
        "build_subdirectory": metadata["build_subdirectory"],
    }

    with open(updated_extensions_index_dir / f"{filepath.stem}.json", 'w') as fileContents:
        fileContents.write(json.dumps(updated_metadata, sort_keys=True, indent=2))
        fileContents.write("\n")


print(f"Generated {index + 1} extension files (.json)")

from pprint import pprint as pp

print(f"\nMetadata of extension #{index + 1} ({filepath.stem}):\n")
pp(updated_metadata)
```
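
To make the field mapping concrete, here is a minimal sketch that applies the same parsing rules and field selection to an in-memory s4ext snippet instead of files on disk. The sample content (URL, revision, category, homepage) is invented for illustration, and the helper names `parse_s4ext_text` and `to_json_metadata` are hypothetical; they are not part of the script above.

```python
import json

# Hypothetical s4ext content; every value here is invented for illustration.
SAMPLE_S4EXT = """\
# Example description file
scmurl https://github.com/example/SlicerExample.git
scmrevision main
depends NA
build_subdirectory .
category Examples
homepage https://example.org
"""


def parse_s4ext_text(text):
    """Parse s4ext content given as a string, using the same rules as parse_s4ext."""
    metadata = {}
    for line in text.splitlines():
        # Skip blank lines and comments.
        if not line.strip() or line.startswith("#"):
            continue
        # The first space separates the key from its value.
        fields = [field.strip() for field in line.split(" ", 1)]
        metadata[fields[0]] = fields[1] if len(fields) == 2 else None
    return metadata


def to_json_metadata(metadata):
    """Keep only the fields retained by the conversion script."""
    depends = metadata.get("depends", "NA")
    return {
        "scmurl": metadata["scmurl"],
        "scmrevision": metadata["scmrevision"],
        "build_dependencies": [] if depends == "NA" else depends.split(" "),
        "category": metadata["category"],
        "build_subdirectory": metadata["build_subdirectory"],
    }


converted = to_json_metadata(parse_s4ext_text(SAMPLE_S4EXT))
print(json.dumps(converted, sort_keys=True, indent=2))
```

In this sketch, fields such as `homepage` are dropped, and a `depends` value of `NA` becomes an empty `build_dependencies` list.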

@jcfr
Member Author

jcfr commented Mar 19, 2024

@lassoan This is ready for review

@jcfr jcfr requested a review from lassoan March 19, 2024 22:42
.pre-commit-config.yaml: review thread outdated and resolved
Contributor

@lassoan lassoan left a comment

Thank you, it looks good to me.

I don't really like the inconsistent naming convention (build_dependencies uses _ separator, while scmrevision does not use separator), but if you have a good reason for this then I can live with it.

@jcfr
Member Author

jcfr commented Apr 22, 2024

Extension Updates

List of pull requests intended to fix and consolidate extension metadata:

@jcfr jcfr force-pushed the transition-from-s4ext-to-json branch 3 times, most recently from b0679b9 to b3566f6, on April 23, 2024 17:14

A script like the following was used to convert the s4ext files to json.
Note that "pre-commit run -a" was also executed to reformat the json files
afterward.

```
import json
import sys
from pathlib import Path

extensions_index_dir = Path("/home/jcfr/Projects/ExtensionsIndex")
updated_extensions_index_dir = extensions_index_dir

def parse_s4ext(ext_file_path):
    """Parse a Slicer extension description file.
    :param ext_file_path: Path to a Slicer extension description file.
    """
    ext_metadata = {}
    with open(ext_file_path) as ext_file:
        for line in ext_file:
            if not line.strip() or line.startswith("#"):
                continue
            fields = [field.strip() for field in line.split(' ', 1)]
            assert(len(fields) <= 2)
            ext_metadata[fields[0]] = fields[1] if len(fields) == 2 else None
    return ext_metadata

s4ext_filepaths = list(extensions_index_dir.glob("*.s4ext"))

print(f"Found {len(s4ext_filepaths)} extension files (.s4ext)")

for index, filepath in enumerate(s4ext_filepaths):

    metadata = parse_s4ext(filepath)
    updated_metadata = {
        "scm_url": metadata["scmurl"],
        "scm_revision": metadata["scmrevision"],
        "build_dependencies": [] if metadata.get("depends", "NA") == "NA" else metadata["depends"].split(" "),
        "category": metadata["category"],
        "build_subdirectory": metadata["build_subdirectory"],
    }

    with open(updated_extensions_index_dir / f"{filepath.stem}.json", 'w') as fileContents:
        fileContents.write(json.dumps(updated_metadata, sort_keys=True, indent=2))
        fileContents.write("\n")

print(f"Generated {index + 1} extension files (.json)")

from pprint import pprint as pp

print(f"\nMetadata of extension #{index + 1} ({filepath.stem}):\n")
pp(updated_metadata)
```

Update GitHub and CircleCI settings to check for json files.

Update `check_description_files` CLI to parse json files:
* Remove `parse_s4ext` and introduce `parse_json` function along with
  `ExtensionParseError` exception.
* Update list of supported URL schemes. Only `https` and `git` are supported.
* Remove obsolete `check_scm_notlocal` check.
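
For readers unfamiliar with the CLI, the changes listed above could look roughly like the following sketch. Only the names `parse_json` and `ExtensionParseError`, the `https`/`git` restriction, and the `scm_url` key come from this pull request; the signatures, the exception's constructor, and the `check_scm_url_scheme` helper are assumptions made for illustration and may differ from the actual `check_description_files` code.

```python
import json
from pathlib import Path
from urllib.parse import urlparse

# Schemes named in the commit message.
SUPPORTED_URL_SCHEMES = ("https", "git")


class ExtensionParseError(RuntimeError):
    """Raised when a description file cannot be parsed (constructor is an assumption)."""

    def __init__(self, extension_name, details):
        super().__init__(f"{extension_name}: {details}")
        self.extension_name = extension_name
        self.details = details


def parse_json(ext_file_path):
    """Load a json description file and return its metadata as a dict."""
    ext_file_path = Path(ext_file_path)
    try:
        with open(ext_file_path) as ext_file:
            return json.load(ext_file)
    except json.JSONDecodeError as exc:
        # Report the failure together with the extension name (the file stem).
        raise ExtensionParseError(ext_file_path.stem, str(exc)) from exc


def check_scm_url_scheme(extension_name, metadata):
    """Hypothetical check: return an error message for an unsupported scheme, else None."""
    scheme = urlparse(metadata.get("scm_url", "")).scheme
    if scheme not in SUPPORTED_URL_SCHEMES:
        return f"{extension_name}: scm_url scheme '{scheme}' is not one of {SUPPORTED_URL_SCHEMES}"
    return None
```

Wrapping `json.JSONDecodeError` in a dedicated exception lets a caller report which extension file failed without knowing about the underlying parser.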

A script like the following was used to add the "$schema" entry to each json
file. Note that "pre-commit run -a" was also executed to reformat the json
files afterward.

```
import json
import sys
from pathlib import Path

extensions_index_dir = Path("/home/jcfr/Projects/ExtensionsIndex")

json_filepaths = list(extensions_index_dir.glob("*.json"))
print(f"Found {len(json_filepaths)} extension files (.json)")

for index, filepath in enumerate(json_filepaths):
    with open(filepath) as fileContents:
        metadata = json.load(fileContents)
        metadata["$schema"] = "https://raw.githubusercontent.com/Slicer/Slicer/main/Schemas/slicer-extension-catalog-entry-schema-v1.0.0.json#"

    with open(filepath, 'w') as fileContents:
        fileContents.write(json.dumps(metadata, sort_keys=True, indent=2))
        fileContents.write("\n")
```
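
As a quick sanity check after running such a script, one could list the json files that still lack the expected "$schema" entry. The sketch below uses only the standard library and does not validate files against the schema itself; the function name `find_files_missing_schema` is hypothetical and not part of this repository.

```python
import json
from pathlib import Path

# Schema URL taken from the script above.
EXPECTED_SCHEMA = "https://raw.githubusercontent.com/Slicer/Slicer/main/Schemas/slicer-extension-catalog-entry-schema-v1.0.0.json#"


def find_files_missing_schema(index_dir):
    """Return the names of json files whose "$schema" value differs from EXPECTED_SCHEMA."""
    missing = []
    for filepath in sorted(Path(index_dir).glob("*.json")):
        with open(filepath) as file_contents:
            metadata = json.load(file_contents)
        if metadata.get("$schema") != EXPECTED_SCHEMA:
            missing.append(filepath.name)
    return missing
```

Calling `find_files_missing_schema` on the extensions index directory returns an empty list when every entry carries the schema reference.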
@jcfr jcfr force-pushed the transition-from-s4ext-to-json branch from 4928e97 to 2a00294, on April 24, 2024 03:30
@jcfr jcfr merged commit c3160a7 into Slicer:main Apr 24, 2024
3 checks passed
@jcfr jcfr deleted the transition-from-s4ext-to-json branch April 24, 2024 03:35
3 participants