Skip to content
This repository has been archived by the owner on Jun 23, 2024. It is now read-only.

Commit

Permalink
Fix HttpUtils.is_blob
Browse files Browse the repository at this point in the history
  • Loading branch information
SanmerDev committed Sep 5, 2023
1 parent d2d7200 commit 974a3b4
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion sync/utils/HttpUtils.py
Original file line number Diff line number Diff line change
Expand Up @@ -57,7 +57,7 @@ def download(cls, url: str, out: Path) -> float:

@classmethod
def is_blob(cls, url):
pattern = r"https://github\.com/.+/blob/main/.+"
pattern = r"https://github\.com/.+/blob/.+"
match = re.match(pattern, url)
if match:
return True
Expand Down

5 comments on commit 974a3b4

@IzzySoft
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the changelog mentions this as

Filter out changelog of type https://github\.com/.+/blob/.+

You could replace /blob/ with /raw/. And to be safe:

pattern = r"https://github\.com/[^/]+/[^/]+/blob/.+"

else it would also filter any directory named blob, e.g. with https://github.com/foo/bar/raw/some/blob/file.txt.

@SanmerDev
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/ya0211/magisk-modules-repo-util/raw/main/README.md

This is also acceptable because we need the URL that points to the raw content.

https://github.com/ya0211/magisk-modules-repo-util/blob/main/README.md

This URL points to the content after being processed by GitHub (a web page), so it is not acceptable.

@IzzySoft
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's clear. But

  • .+ is "greedy". In r"https://github\.com/.+/blob/.+" the first .+ matches any number of directory levels – not just owner/repo but also owner/repo/raw/ten/levels/more, and then it's no longer the web page created by Github. Admitted, my example was not clear, so let me give another one you probably would not want to skip: github.com/owner/repo/raw/some/blob/dir/changelog.md. Edge cases of course, but those could exist. Hence my suggestion to improve the RegEx.
  • as your examples show, simply replacing (the correct) blob with raw would lead to the acceptable URL. If you already do that in another place, I of course withdraw this suggestion 😉

@SanmerDev
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I overlooked the impact of .+ (🤪. I will make the necessary modifications for this.

Thanks for the suggestion.

@IzzySoft
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! And gladly 😃

Please sign in to comment.