Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: generate_library.sh with postprocessing #1951

Merged
merged 203 commits into from Oct 31, 2023

Conversation

diegomarquezp
Copy link
Contributor

@diegomarquezp diegomarquezp commented Aug 30, 2023

This PR adds post-processing to generate_library.sh
The changes are intended for pure GAPICs in google-cloud-java

Remarks:

  • Although some HW libraries can be generated correctly, we are still at mercy of unmerged PRs (doc) that could affect the contents of its main branch. We are skipping testing against HW libraries
  • For the case of java-bigtable and dialogflow, we have owlbot.py files referencing libraries or versions that have not been generated by the current run of generate_library.sh (doc). For this reason, HW libraries are excluded as well as dialogflow, which is left as a commented row for future inspection
  • We keep googleapis-gen tests as a separate option in the integration tests, which can be activated by using --enable_postprocessing=false. This is a convenience for local development
  • Some libraries will produce package-info.java files with incorrect order of variables, although logically equivalent. This is solved by using LC_COLLATE=C for byte-level sorting of proto files when passing as input to the gapic generator.
  • At some point the script was capable of new-library generation, for which most changes were removed in b3169c3 and d03aad6 for the sake of restricting this PR to consider only pre-existing libraries. It was able to generate some gapic libraries (untouched by humans) that produced no differences even at the pom.xml level, whitespace included

Integration test

  1. sparse clone google-cloud-java and the specified repository_path (e.g. google-cloud-java/java-asset)
  2. run generate_library.sh.
  3. The script removes the workspace folder every time it's called
  4. Transfer the post-processed files to the repository_path in google-cloud-java
  5. Perform git diff
  6. Run compare-poms.py to compare pom.xml files ignoring element order and whitespace
  7. If git diff or compare-poms.py found differences, exit with non-zero and stop the test suite
  8. Run for all libraries

Post-processing workflow:

  1. Prepare output from pre-processed library into a folder structure that mimics googleapis-gen (which is the way it's done in other usages, such as new-library.py. For example, java-asset would be prepared in a folder tree "google/cloud/asset/v2"
  2. This folder is then sent as a volume/argument to a owlbot-cli copy-code
  3. The files are arranged as specified in the library's pre-existing .OwlBot.yaml. The pre-existing necessary files are specified with repository_path (which can be, for example, google-cloud-java/java-asset). The files are then sent to an owl-bot-staging folder
  4. The owl-bot-staging folder is sent to a workspace folder (root directory the owlbot postprocessor works with)
  5. We transfer all pre-existing pom.xml files and owlbot.py to the workspace folder
  6. We run the post-processor in the prepared workspace folder
  7. We run apply_current_versions.sh (a modified file borrowed from google-cloud-java)

@googleapis googleapis deleted a comment from sonarcloud bot Oct 26, 2023
@googleapis googleapis deleted a comment from sonarcloud bot Oct 26, 2023
exit 0
fi
if [ -z "${versions_file}" ];then
echo "no versions.txt argument provided. Please provide one in order to enable post-processing"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's OK to make it required for now since it's only for existing libraries. For new client libraries, I'm assuming we have to generate versions.txt first before calling this script? Is it generated by new client library script currently? cc @JoeWang1127

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@blakeli0 the version of a new library defaults to 0.0.1-SNAPSHOT.

example PR

The new library script applies current versions but this is for other referenced artifacts such as google-cloud-java (example)

I think we still need a versions file for referenced artifacts other than the library being generated when dealing with a new monorepo library

# owlbot
# 4 - scripts_root: location of the generation scripts
# 5 - destination_path: used to transfer the raw grpc, proto and gapic libraries
# 6 - repository_path: path from output_folder to the location of the source of
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't fully understand the usefulness of repository_path? If it's for the poms, they should be already in the output folders, am I missing something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@blakeli0 the pre-existing poms with their folder structure are indeed necessary for the postprocessor.

We need google-cloud-java to be present beforehand in output_folder as well, but the script does not know which folder in the monorepo should obtain the poms, owlbot.py and .OwlBot.yaml from. I don't think there is a good heuristic to infer the monorepo path from proto_path or destination_path. For example looking for .OwlBot.yaml files that have a deep-copy containing destination_path may work but we have cases where the owlbot files do not deal with a single gapic library.

To clarify, the purpose of repository_path is to give the script a way to find the monorepo folder associated with the library being generated. This folder contains the necessary files for owlbot copy and postprocessor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed, I removed usage of destination_path. The integration test, however, will pass repository_path (e.g. java-asset) as destination_path.

NOTE: The usage of owlbot copy works with a yaml file expecting a certain input folder structure. For example, that yaml has a deep-copy section with wildcards in order to build the owl-bot-staging folder (i.e. owlbot copy sends the files stored in deep-copy to owlbot-staging and the owlbot postprocessor runs on them)
I'm taking advantage of the wildcards for owlbot copy to work. The path in this deep-copy section is referring a path in googleapis-gen (example) which is could be something like google-cloud-asset-v1-java but I just named it generated-java (which will be ok for the wildcard)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm yet to update the readme
edit: done

@googleapis googleapis deleted a comment from sonarcloud bot Oct 30, 2023
@googleapis googleapis deleted a comment from sonarcloud bot Oct 31, 2023
Copy link

sonarcloud bot commented Oct 31, 2023

[gapic-generator-java-root] Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
No Duplication information No Duplication information

Copy link

sonarcloud bot commented Oct 31, 2023

[java_showcase_integration_tests] Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
No Duplication information No Duplication information

@diegomarquezp diegomarquezp merged commit 39b9f0e into main Oct 31, 2023
44 checks passed
@diegomarquezp diegomarquezp deleted the feat/generate-aiplatform-with-postprocessing branch October 31, 2023 23:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hermetic-build size: l Pull request size is large.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants