Merged
59 commits
00126e6
Chore(deps): Bump jinja2 from 3.1.2 to 3.1.3
dependabot[bot] Jan 11, 2024
645829e
Merge pull request #188 from Commonjava/dependabot/pip/jinja2-3.1.3
ligangty Jan 11, 2024
4010f82
Upgrade moto version to 3.0.7
ligangty Jan 30, 2024
23c94f0
Disable two linters in gh action
ligangty Jan 31, 2024
53b0883
Merge pull request #189 from ligangty/upd
ligangty Feb 1, 2024
1f7e4f0
Split commands into separate files
ligangty Feb 6, 2024
363573b
Merge pull request #190 from ligangty/cmd
ligangty Feb 19, 2024
b711bf8
Add content digest util method
ligangty Feb 19, 2024
380df25
Merge pull request #191 from ligangty/utils
ligangty Feb 19, 2024
e7cf9ef
Add command for re-index of folder
ligangty Feb 9, 2024
d64ad34
Merge pull request #193 from ligangty/index
ligangty Feb 23, 2024
7abdd29
Fix a wrong logger typo
ligangty Feb 23, 2024
f2a446e
Merge pull request #195 from ligangty/index
ligangty Feb 23, 2024
f004f47
Add command of checksum validation by using http way
ligangty Feb 19, 2024
7f6317a
Merge pull request #192 from ligangty/checksum-http
ligangty Feb 28, 2024
33592de
Chore(deps): Bump urllib3 from 1.26.15 to 1.26.18
dependabot[bot] Feb 28, 2024
4d53f0e
Merge pull request #196 from Commonjava/dependabot/pip/urllib3-1.26.18
ligangty Feb 28, 2024
54cf5bd
Fix a bug for re-index
ligangty Feb 28, 2024
82f255e
Merge pull request #197 from ligangty/index
ligangty Feb 28, 2024
b58e685
Some chore fix
ligangty Feb 29, 2024
3dd7e96
Merge pull request #198 from ligangty/main
ligangty Feb 29, 2024
41d2b32
Use HTMLParser instead bs4 in checksum validation
ligangty Mar 20, 2024
c0feb5d
Merge pull request #199 from ligangty/checksum-http
ligangty Mar 20, 2024
0fbf7ea
Mark sample files
ligangty Mar 20, 2024
b9cd1b5
Merge pull request #200 from ligangty/template
ligangty Mar 20, 2024
2468c20
Add support for CloudFront invalidating
ligangty Mar 14, 2024
aa2e110
Add new command to clear CF cache
ligangty Mar 26, 2024
b09b10e
Refine some logging
ligangty Mar 27, 2024
a92f24e
Merge pull request #201 from ligangty/cf
ligangty Mar 27, 2024
a52ba7a
Use wildcard for paths in maven CF invalidating
ligangty Mar 27, 2024
ccd5b55
Merge pull request #202 from ligangty/cf
ligangty Mar 28, 2024
a6ebcbb
Change cf cmd name to cf-invalidate
ligangty Mar 28, 2024
626a346
Add command to do CF invalidation status check
ligangty Mar 28, 2024
8238fd5
Merge pull request #203 from ligangty/cf
ligangty Mar 28, 2024
78e40bf
Refine the command
ligangty Mar 28, 2024
b1765ed
Merge pull request #204 from ligangty/cf
ligangty Mar 28, 2024
8317e4b
Fix typo for domain check
ligangty Mar 28, 2024
71c4725
Merge pull request #205 from ligangty/cf
ligangty Mar 28, 2024
c7cdb9e
Some updates
ligangty Apr 1, 2024
0800dcc
Merge pull request #206 from ligangty/cf
ligangty Apr 2, 2024
6566aac
Fix: re-index wrong usage of the type
ligangty Apr 3, 2024
a669066
Merge pull request #207 from ligangty/cf
ligangty Apr 3, 2024
086cb81
Fix two issues
ligangty Apr 3, 2024
6814768
Merge pull request #210 from ligangty/cf
ligangty Apr 3, 2024
da9a6e4
Wait for each invalidation request's completion
ligangty Apr 3, 2024
65c0f61
Merge pull request #211 from ligangty/cf
ligangty Apr 10, 2024
83aa6b4
Fix wrong picking of the npm package.json
ligangty Apr 3, 2024
67f60da
Merge pull request #212 from ligangty/cf
ligangty Apr 11, 2024
adde08c
Refine the output for cf invalidation request
ligangty Apr 11, 2024
105e6c3
Merge pull request #213 from ligangty/cf
ligangty Apr 11, 2024
28a7241
Add extra 1s wait for next CF invalidation request
ligangty Apr 11, 2024
0b73841
Merge pull request #214 from ligangty/cf
ligangty Apr 11, 2024
fbe8af7
Add progress counting for CF requests processing
ligangty Apr 11, 2024
618dc2c
Merge pull request #215 from ligangty/cf
ligangty Apr 11, 2024
a18fb7f
Fix a simple logging issue
ligangty Apr 11, 2024
f26cf54
Merge pull request #216 from ligangty/main
ligangty Apr 12, 2024
13b1459
Update release info for spec file of 1.3.0
ligangty Apr 12, 2024
44d173b
Merge pull request #217 from ligangty/main
ligangty Apr 12, 2024
32ddde4
Merge branch 'main' into release
ligangty Apr 12, 2024
54 changes: 27 additions & 27 deletions .github/workflows/linters.yaml
@@ -29,16 +29,16 @@ jobs:
- name: Run flake8 on python${{ matrix.python-version }}
run: python -m tox -e flake8

markdownlint:
name: Markdownlint
runs-on: ubuntu-latest
# markdownlint:
# name: Markdownlint
# runs-on: ubuntu-latest

steps:
- name: Check out repo
uses: actions/checkout@v2
# steps:
# - name: Check out repo
# uses: actions/checkout@v2

- name: Run markdownlint
uses: containerbuildsystem/actions/markdownlint@master
# - name: Run markdownlint
# uses: containerbuildsystem/actions/markdownlint@master

pylint:
name: Pylint analyzer for Python ${{ matrix.python-version }}
@@ -91,22 +91,22 @@ jobs:
# - name: Run mypy on python${{ matrix.python-version }}
# run: python -m tox -e mypy

bandit:
name: Bandit analyzer for Python ${{ matrix.python-version }}
runs-on: ubuntu-latest

strategy:
matrix:
python-version: [ "3.8" ]

steps:
- uses: actions/checkout@v1
- uses: actions/setup-python@v4
with:
python-version: ${{ matrix.python-version }}
- name: Install dependencies
run: |
python -m pip install --upgrade pip setuptools tox

- name: Run bandit analyzer on python${{ matrix.python-version }}
run: python -m tox -e bandit
# bandit:
# name: Bandit analyzer for Python ${{ matrix.python-version }}
# runs-on: ubuntu-latest

# strategy:
# matrix:
# python-version: [ "3.8" ]

# steps:
# - uses: actions/checkout@v1
# - uses: actions/setup-python@v4
# with:
# python-version: ${{ matrix.python-version }}
# - name: Install dependencies
# run: |
# python -m pip install --upgrade pip setuptools tox

# - name: Run bandit analyzer on python${{ matrix.python-version }}
# run: python -m tox -e bandit
4 changes: 4 additions & 0 deletions .gitignore
@@ -11,8 +11,12 @@ coverage
.vscode
package/
.local
local
.DS_Store

# Unit test
__pytest_reports
htmlcov

# Generated when local run
*.log
18 changes: 18 additions & 0 deletions README.md
@@ -96,3 +96,21 @@ This command will delete some paths from repo in S3.
but not delete the artifacts themselves.
* During or after the paths' deletion, regenerate the
metadata files and index files for both types.

### charon-index: refresh the index.html for the specified path

```bash
usage: charon index $PATH [-t, --target] [-D, --debug] [-q, --quiet]
```

This command will refresh the index.html for the specified path.

* Note that if the path is an NPM metadata path that contains package.json, the refresh has no effect, because HTTP requests to this type of folder are served the package.json instead of the index.html.

### charon-validate: validate the checksum of files in specified path in a maven repository

```bash
usage: charon validate $path [-t, --target] [-f, --report_file_path] [-i, --includes] [-r, --recursive] [-D, --debug] [-q, --quiet]
```

This command validates the checksums of the files under the specified path in a maven repository. It calculates the sha1 checksum of every artifact file under the path, compares it with the artifact's accompanying .sha1 file, and records all mismatched artifacts in the report file. Artifacts that are missing their accompanying .sha1 file are also recorded.
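
The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not charon's implementation: it reads from a local directory rather than over HTTP, the helper names `sha1_of` and `validate_dir` are hypothetical, and it assumes the digest is the first whitespace-separated token of each `.sha1` file. It also only skips `.sha1` files; a real repository holds other checksum and metadata files that would need filtering.

```python
import hashlib
from pathlib import Path
from typing import List, Tuple


def sha1_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def validate_dir(root: Path) -> Tuple[List[str], List[str]]:
    """Return (mismatched, missing) artifact names under root.

    Hypothetical helper: 'missing' lists artifacts without a .sha1 companion.
    """
    mismatched: List[str] = []
    missing: List[str] = []
    for artifact in sorted(root.rglob("*")):
        # Skip directories and the checksum files themselves.
        if not artifact.is_file() or artifact.suffix == ".sha1":
            continue
        companion = artifact.with_name(artifact.name + ".sha1")
        if not companion.exists():
            missing.append(artifact.name)
            continue
        # Assumption: the digest is the first token in the companion file.
        tokens = companion.read_text().split()
        expected = tokens[0].lower() if tokens else ""
        if sha1_of(artifact) != expected:
            mismatched.append(artifact.name)
    return mismatched, missing
```

Calling `validate_dir(Path("some/repo/path"))` returns the two lists that a report file would record.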
10 changes: 9 additions & 1 deletion charon.spec
@@ -51,7 +51,6 @@ Requires: python%{python3_pkgversion}-zipp
Requires: python%{python3_pkgversion}-attrs
Requires: python%{python3_pkgversion}-pyrsistent


%description
Simple Python tool with command line interface for charon init,
upload, delete, gen and ls functions.
@@ -81,6 +80,15 @@ export LANG=en_US.UTF-8 LANGUAGE=en_US.en LC_ALL=en_US.UTF-8


%changelog
* Fri Apr 12 2024 Gang Li <gli@redhat.com>
- 1.3.0 release
- Add validate command: validate the checksum for maven artifacts
- Add index command: support re-indexing of the specified folder
- Add CF invalidating features:
  - Invalidate generated metadata files (maven-metadata*/package.json/index.html) in CloudFront after product uploading/deleting
  - Add command to do CF invalidating and checking
- Fix bug: picking the root package.json as the first priority one to generate npm package path

* Mon Sep 18 2023 Harsh Modi <hmodi@redhat.com>
- 1.2.2 release
- hot fix for "dist_tags" derived issue
6 changes: 0 additions & 6 deletions charon/__init__.py
@@ -13,9 +13,3 @@
See the License for the specific language governing permissions and
limitations under the License.
"""

from charon.cmd.command import cli, upload, delete

# init group command
cli.add_command(upload)
cli.add_command(delete)
204 changes: 204 additions & 0 deletions charon/cache.py
@@ -0,0 +1,204 @@
from boto3 import session
from botocore.exceptions import ClientError
from typing import Dict, List
import os
import logging
import uuid
import time

logger = logging.getLogger(__name__)

ENDPOINT_ENV = "aws_endpoint_url"
INVALIDATION_BATCH_DEFAULT = 3000
INVALIDATION_BATCH_WILDCARD = 15

INVALIDATION_STATUS_COMPLETED = "Completed"
INVALIDATION_STATUS_INPROGRESS = "InProgress"

DEFAULT_BUCKET_TO_DOMAIN = {
    "prod-ga": "maven.repository.redhat.com",
    "prod-maven-ga": "maven.repository.redhat.com",
    "prod-ea": "maven.repository.redhat.com",
    "prod-maven-ea": "maven.repository.redhat.com",
    "stage-ga": "maven.stage.repository.redhat.com",
    "stage-maven-ga": "maven.stage.repository.redhat.com",
    "stage-ea": "maven.stage.repository.redhat.com",
    "stage-maven-ea": "maven.stage.repository.redhat.com",
    "prod-npm": "npm.registry.redhat.com",
    "prod-npm-npmjs": "npm.registry.redhat.com",
    "stage-npm": "npm.stage.registry.redhat.com",
    "stage-npm-npmjs": "npm.stage.registry.redhat.com"
}


class CFClient(object):
"""The CFClient is a wrapper around the original boto3 CloudFront client,
which provides the CloudFront functions used in charon.
"""

def __init__(
self,
aws_profile=None,
extra_conf=None
) -> None:
self.__client = self.__init_aws_client(aws_profile, extra_conf)

def __init_aws_client(
self, aws_profile=None, extra_conf=None
):
if aws_profile:
logger.debug("[CloudFront] Using aws profile: %s", aws_profile)
cf_session = session.Session(profile_name=aws_profile)
else:
cf_session = session.Session()
endpoint_url = self.__get_endpoint(extra_conf)
return cf_session.client(
'cloudfront',
endpoint_url=endpoint_url
)

def __get_endpoint(self, extra_conf) -> str:
endpoint_url = os.getenv(ENDPOINT_ENV)
if not endpoint_url or not endpoint_url.strip():
if isinstance(extra_conf, Dict):
endpoint_url = extra_conf.get(ENDPOINT_ENV, None)
if endpoint_url:
logger.info(
"[CloudFront] Using endpoint url for aws CF client: %s",
endpoint_url
)
else:
logger.debug("[CloudFront] No user-specified endpoint url is used.")
return endpoint_url

def invalidate_paths(
self, distr_id: str, paths: List[str],
batch_size=INVALIDATION_BATCH_DEFAULT
) -> List[Dict[str, str]]:
"""Send invalidation requests to CloudFront for the paths in a distribution.
This invalidates the paths in the distribution to force them to be refreshed
from the backend S3 bucket. For details see:
https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/Invalidation.html
* The distr_id is the id of the distribution. This id can be obtained through
the get_dist_id_by_domain(domain) function.
* The paths to invalidate are specified through the paths param.
* Batch size is the number of paths to be invalidated in one request.
The default value is 3000, which is the maximum number in the official doc:
https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/Invalidation.html#InvalidationLimits
"""
INPRO_W_SECS = 5
NEXT_W_SECS = 1
real_paths = [paths]
# Split paths into batches by batch_size
if batch_size:
real_paths = [paths[i:i + batch_size] for i in range(0, len(paths), batch_size)]
total_time_approx = len(real_paths) * (INPRO_W_SECS * 2 + NEXT_W_SECS)
logger.info("There will be %d invalidating requests in total,"
" will take more than %d seconds",
len(real_paths), total_time_approx)
results = []
current_invalidation = {}
processed_count = 0
for batch_paths in real_paths:
while (current_invalidation and
INVALIDATION_STATUS_INPROGRESS == current_invalidation.get('Status', '')):
time.sleep(INPRO_W_SECS)
try:
result = self.check_invalidation(distr_id, current_invalidation.get('Id'))
if result:
current_invalidation = {
'Id': result.get('Id', None),
'Status': result.get('Status', None)
}
logger.debug("Check invalidation: %s", current_invalidation)
except Exception as err:
logger.warning(
"[CloudFront] Error occurred while checking invalidation status during"
" creating invalidation, invalidation: %s, error: %s",
current_invalidation, err
)
break
if current_invalidation:
results.append(current_invalidation)
processed_count += 1
if processed_count % 10 == 0:
logger.info(
"[CloudFront] ######### %d/%d requests finished",
processed_count, len(real_paths))
# To avoid conflict rushing request, we can wait 1s here
# for next invalidation request sending.
time.sleep(NEXT_W_SECS)
caller_ref = str(uuid.uuid4())
logger.debug(
"Processing invalidation for batch with ref %s, size: %s",
caller_ref, len(batch_paths)
)
try:
response = self.__client.create_invalidation(
DistributionId=distr_id,
InvalidationBatch={
'CallerReference': caller_ref,
'Paths': {
'Quantity': len(batch_paths),
'Items': batch_paths
}
}
)
if response:
invalidation = response.get('Invalidation', {})
current_invalidation = {
'Id': invalidation.get('Id', None),
'Status': invalidation.get('Status', None)
}
except Exception as err:
logger.error(
"[CloudFront] Error occurred while creating invalidation"
" for paths %s, error: %s", batch_paths, err
)
if current_invalidation:
results.append(current_invalidation)
return results

def check_invalidation(self, distr_id: str, invalidation_id: str) -> dict:
try:
response = self.__client.get_invalidation(
DistributionId=distr_id,
Id=invalidation_id
)
if response:
invalidation = response.get('Invalidation', {})
return {
'Id': invalidation.get('Id', None),
'CreateTime': str(invalidation.get('CreateTime', None)),
'Status': invalidation.get('Status', None)
}
except Exception as err:
logger.error(
"[CloudFront] Error occurred while check invalidation of id %s, "
"error: %s", invalidation_id, err
)

def get_dist_id_by_domain(self, domain: str) -> str:
"""Get the distribution id by a domain name. The id can be used to send invalidation
requests through the invalidate_paths function.
* Domains are Ronda domains, like "maven.repository.redhat.com"
or "npm.registry.redhat.com"
"""
try:
response = self.__client.list_distributions()
if response:
dist_list_items = response.get("DistributionList", {}).get("Items", [])
for distr in dist_list_items:
aliases_items = distr.get('Aliases', {}).get('Items', [])
if aliases_items and domain in aliases_items:
return distr['Id']
logger.error("[CloudFront]: Distribution not found for domain %s", domain)
except ClientError as err:
logger.error(
"[CloudFront]: Error occurred while get distribution for domain %s: %s",
domain, err
)
return None

def get_domain_by_bucket(self, bucket: str) -> str:
return DEFAULT_BUCKET_TO_DOMAIN.get(bucket, None)
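
The batching step in `invalidate_paths` and its logged time estimate can be tried in isolation. The sketch below only mirrors the list slicing and the `len(real_paths) * (INPRO_W_SECS * 2 + NEXT_W_SECS)` arithmetic from the diff; the function names `split_batches` and `estimate_seconds` are hypothetical, and no CloudFront call is made.

```python
from typing import List

INPRO_W_SECS = 5  # polling interval while an invalidation is in progress
NEXT_W_SECS = 1   # pause before sending the next invalidation request


def split_batches(paths: List[str], batch_size: int) -> List[List[str]]:
    """Split paths into consecutive batches of at most batch_size items."""
    if not batch_size:
        # A falsy batch size means "send everything in one request".
        return [paths]
    return [paths[i:i + batch_size] for i in range(0, len(paths), batch_size)]


def estimate_seconds(batch_count: int) -> int:
    """Rough total wait, using the same formula as the log message in the diff."""
    return batch_count * (INPRO_W_SECS * 2 + NEXT_W_SECS)


if __name__ == "__main__":
    batches = split_batches([f"/path/{i}" for i in range(7000)], 3000)
    print([len(b) for b in batches])       # [3000, 3000, 1000]
    print(estimate_seconds(len(batches)))  # 33
```

With 7000 paths and the default batch size of 3000, this gives three requests and an estimate of 33 seconds, which the diff's log message treats as a lower bound ("will take more than %d seconds").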
23 changes: 23 additions & 0 deletions charon/cmd/__init__.py
@@ -13,3 +13,26 @@
See the License for the specific language governing permissions and
limitations under the License.
"""
from click import group
from charon.cmd.cmd_upload import upload
from charon.cmd.cmd_delete import delete
from charon.cmd.cmd_index import index
from charon.cmd.cmd_checksum import checksum_validate
from charon.cmd.cmd_cache import cf_invalidate, cf_check


@group()
def cli():
    """Charon is a tool to synchronize several types of
    artifacts repository data to Red Hat Ronda
    service (maven.repository.redhat.com).
    """


# init group command
cli.add_command(upload)
cli.add_command(delete)
cli.add_command(index)
cli.add_command(checksum_validate)
cli.add_command(cf_invalidate)
cli.add_command(cf_check)
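
The `cf_invalidate` and `cf_check` commands registered above need a distribution id, which `get_dist_id_by_domain` in charon/cache.py finds by matching a domain against each distribution's aliases. The sketch below reproduces that lookup on a plain dict shaped like a boto3 `list_distributions` response, so it runs without AWS access. The function name `find_distribution_id` and the ids in `sample` are made up for illustration, and it inspects a single response page only.

```python
from typing import Any, Dict, Optional


def find_distribution_id(response: Dict[str, Any], domain: str) -> Optional[str]:
    """Return the id of the first distribution whose aliases include domain."""
    items = response.get("DistributionList", {}).get("Items", [])
    for distr in items:
        # 'Items' may be absent when a distribution has no aliases.
        aliases = distr.get("Aliases", {}).get("Items", [])
        if domain in aliases:
            return distr["Id"]
    return None


# Hypothetical response data; the ids are placeholders, not real distributions.
sample = {
    "DistributionList": {
        "Items": [
            {"Id": "E1EXAMPLE", "Aliases": {"Quantity": 1,
                                            "Items": ["maven.repository.redhat.com"]}},
            {"Id": "E2EXAMPLE", "Aliases": {"Quantity": 1,
                                            "Items": ["npm.registry.redhat.com"]}},
            {"Id": "E3EXAMPLE", "Aliases": {"Quantity": 0}},
        ]
    }
}

if __name__ == "__main__":
    print(find_distribution_id(sample, "npm.registry.redhat.com"))  # E2EXAMPLE
    print(find_distribution_id(sample, "unknown.example.com"))      # None
```

Returning `Optional[str]` makes the "not found" case explicit to callers, which the `-> str` annotation in the diff does not.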