Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

extract out the connector changelog modification out of the bump_version code #34586

Conversation

stephane-airbyte
Copy link
Contributor

@stephane-airbyte stephane-airbyte commented Jan 27, 2024

We want to move the changelog mutation out of the connectors command, so we can reuse it for java CDK.
all connectors were tested, and the only ones that were failing are fixed in #36039

Copy link

vercel bot commented Jan 27, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Ignored Deployment
Name Status Preview Comments Updated (UTC)
airbyte-docs ⬜️ Ignored (Inspect) Visit Preview Mar 14, 2024 0:11am

Copy link
Contributor Author

stephane-airbyte commented Jan 27, 2024

@stephane-airbyte stephane-airbyte force-pushed the stephane/01-27-extract_out_the_connector_changelog_modification_out_of_the_bump_version_code branch 9 times, most recently from 0ea0a74 to f3301f8 Compare January 31, 2024 19:50
@stephane-airbyte stephane-airbyte marked this pull request as ready for review January 31, 2024 20:11
@stephane-airbyte stephane-airbyte requested review from a team and alafanechere January 31, 2024 20:11
Copy link
Contributor

@alafanechere alafanechere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The overall logic looks perfect to me. I made a couple of suggestion to boost reusability and make it a bit more refactor proof.

Comment on lines 15 to 21
def __init__(self, changelog_file_path: Path) -> None:
self.changelog_file_path = changelog_file_path
if not changelog_file_path.exists():
raise Exception(f"File {changelog_file_path} does not exist")
if not changelog_file_path.is_file():
raise Exception(f"File {changelog_file_path} is not a regular file")
self.original_markdown_lines = changelog_file_path.read_text().splitlines()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we pass the original_markdown_lines as a constructor parameter and avoid path checks logic there?
I'd love to avoid direct filesystem access if possible.
I'd prefer the changelog read to happen from a dagger directory object. This will help us make this Changelog agnostic of the underlying filesystem / path: a dagger directory could come from a container, a remote git repo or the host fs.

Feel free to not change how the documentation file path is fetched, but please change the constructor to take the raw markdown string as an input instead of the file path

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

changelog_entry_re = (
"^\\| *(?P<version>[0-9]+\\.[0-9+]+\\.[0-9]+?) *\\| *"
+ "(?P<day>[0-9]{4}-[0-9]{2}-[0-9]{2}) *\\| *"
+ "\\[?(?P<pr_number1>[0-9]*)\\]? ?\\(https://github.com/airbytehq/airbyte/pull/(?P<pr_number2>[0-9]*)\\) *\\| *"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a github_repo_url parameter to the constructor which defaults to a constant built from this one?
It'll make it less coupled to our repo if we allow airbyte-ci to run on different repo anytime

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

)
return "\n".join(new_lines) + "\n"

def __parse_markdown(self) -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a docstring describing the parsing logic?

changelog_header_line_index = line_index
break
if changelog_header_line_index == -1:
raise Exception("Could not find the changelog section table in the documentation file.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you create a custom ChangelogParsing exception? Exception is too broad 😀

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

self.old_entries.add(changelog_entry)


class ChangelogEntry:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use a frozen dataclass there? It'll make the class leaner and make it explicitely a stateless object

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no idea what a "frozen dataclass" is... I guess I have some reading to do

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can consider Dataclass to be models / POJO. They're could frozen when they're immutable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

self.pr_number = int(pr_number)
self.comment = comment

def to_str(self) -> str:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def to_str(self) -> str:
def to_markdown(self) -> str:

or

Suggested change
def to_str(self) -> str:
def __str__(self) -> str:

if you'd like to call str(changelog_entry)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

)
return "\n".join(new_lines) + "\n"

def __parse_markdown(self) -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make this a public static or class method which would return a list of changelog entry?
I'd find it more elegant to call it this way in the contrusctor:
self.old_entries = self.parse_markdown(original_markdown_line)

Making it public and a static/class method can simplify testing and reuse in other contexts.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the parse_markdown does a lot more than just returning the old entries. It also sets the start and end lines of the changelog. I guess we could redo that in the to_markdown

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is exactly the approach I suggested before, thank you so much!

airbyte-ci/connectors/pipelines/tests/test_changelog.py Outdated Show resolved Hide resolved
@stephane-airbyte stephane-airbyte force-pushed the stephane/01-27-extract_out_the_connector_changelog_modification_out_of_the_bump_version_code branch 5 times, most recently from 7b3473a to 9338a2e Compare February 2, 2024 05:03
Copy link
Contributor Author

@alafanechere I believe I addressed all your comments.

Copy link
Contributor

@alafanechere alafanechere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stephane-airbyte the code looks good to me now! Thank you for the changes. I added a minor suggestion.
I manually tried it on a couple of python connectors and realized:

  • The error reporting was not working. I fixed it in 1c91364
  • The exception thrown is The changelog table in the documentation file is missing the header delimiter.

Can you have a look to make sure the majority of our python connectors can be bumped with this command?
I tried:
airbyte-ci connectors --name=source-stripe bump_version minor 123 "new stuff"

return self.__str__().__hash__()


def parse_markdown(markdown_lines: list[str], github_repo: str) -> [int, set[ChangelogEntry]]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def parse_markdown(markdown_lines: list[str], github_repo: str) -> [int, set[ChangelogEntry]]:
def parse_changelog_in_markdown(markdown_lines: list[str], github_repo: str) -> [int, set[ChangelogEntry]]:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please add a docstring?

@stephane-airbyte stephane-airbyte force-pushed the stephane/01-27-extract_out_the_connector_changelog_modification_out_of_the_bump_version_code branch from 1c91364 to b12bbd7 Compare March 13, 2024 17:53
@stephane-airbyte stephane-airbyte requested a review from a team as a code owner March 13, 2024 17:53
@stephane-airbyte stephane-airbyte force-pushed the stephane/01-27-extract_out_the_connector_changelog_modification_out_of_the_bump_version_code branch 2 times, most recently from 502a276 to 4b43fa9 Compare March 14, 2024 01:34
@stephane-airbyte stephane-airbyte force-pushed the stephane/01-27-extract_out_the_connector_changelog_modification_out_of_the_bump_version_code branch from 4b43fa9 to 723d101 Compare March 14, 2024 01:40
@alafanechere alafanechere force-pushed the stephane/01-27-extract_out_the_connector_changelog_modification_out_of_the_bump_version_code branch from 723d101 to 848e8a7 Compare March 14, 2024 11:20
Copy link
Contributor

@alafanechere alafanechere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fixed type checks which failed the CI.
I tried the command on python connectors and it works as expected 👌
Thanks Stephane for this improvement!

@alafanechere alafanechere force-pushed the stephane/01-27-extract_out_the_connector_changelog_modification_out_of_the_bump_version_code branch from 848e8a7 to 51659f3 Compare March 14, 2024 12:11
@stephane-airbyte stephane-airbyte merged commit 3fae982 into master Mar 14, 2024
31 checks passed
@stephane-airbyte stephane-airbyte deleted the stephane/01-27-extract_out_the_connector_changelog_modification_out_of_the_bump_version_code branch March 14, 2024 15:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants