Skip to content

Conversation

crazytonyli
Copy link
Contributor

This PR adds models for parsing the "plugin information" API endpoint in wordpress.org. You can find some documentation about the endpoint here.

The parsing code is validated using all the plugins in the plugin directory. There are a total of 43k plugins. I have downloaded all the plugin info content and uploaded them to the a8c-ci-cache S3 bucket, which is used in CI to verify the parsing code is valid for all plugins.

Some JSON response fields have special parsing logic, because they have special "empty values" (false, []) which do not match their non-empty values (like an object or a string). I have added many unit tests to make sure those fields are parsed correctly. However, since these unit tests test against specific plugins, which may go away in the future, we may need to update these unit tests as we go.

Copy link
Contributor

@oguzkocer oguzkocer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@crazytonyli This looks good. I have a some minor concern about deserialize_default_values being somewhat easy to misuse, but since it's isolated in this new crate, I don't think we need to do anything at the moment. If we have to roll it into the wp_api crate, we should probably consider renaming it into something more verbose to avoid the accidental misuse.

I left some minor line comments, and have some suggestions about the testing setup:

  1. These tests would be considered integration tests because they require external data and require access to file system. So, I think we should move them to tests directory.
  2. We can use a build script to download the necessary files for these tests. That way it can be part of the cargo build and we can run cargo test without having to explicitly call the make command. We have wp_api_integration_tests/build.rs that creates test credentials for the wp_api integration tests, you can use that for inspiration if you'd like. Note that we don't do this for start-server because it's currently an expensive operation to restart the server. Having said that, there should be a way to check if the server is running, so maybe we should look into that as well. (I revised this suggestion a bit at the end)
  3. I think we should download plugin-directory.tar.gz file to somewhere in target or temp directory of the OS instead of the root folder of wordpress-rs, so that it'll get cleaned up.
  4. After running ./scripts/plugin-directory.sh download_from_s3, plugin-directory.tar.gz the file ended up in my wordpress-rs root folder as an unchecked file. I think we should delete it after we are done with it.
  5. It'd be great if we had some unit tests that didn't require an external json file. That'd make it easier to see what the custom deserializer is supposed to do.
  6. These tests are quite expensive if someone doesn't have access to our S3 bucket. They are also not particularly useful for local development. So, I was wondering if we should mark them as #[ignore] and specifically run them in CI - probably in a step of its own. If we had a few unit tests as suggested in [5], it'd cover the local development, and CI can ensure that we are able to parse all plugins.
  7. Assuming we are going to run these tests in CI using S3, we need a scheduled job to update S3 plugins from the source.

I was re-reading my suggestions, and maybe [2] doesn't make much sense especially if we do [6]. The reason being, we don't know which method we should use to download the plugin directory. If we use S3, it'll only work for us and if we use wp.org directory, it'll be very slow.

Cargo.toml Outdated
"wp_derive_request_builder",
"wp_serde_helper",
"wp_uniffi_bindgen",
"wp_org_api",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this name might be a bit confusing, because technically wp_api is the .org API. I am not sure if the API we are using to fetch the plugin directory has any other endpoints, so maybe we should just call it wp_org_api_plugin_directory?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we'd want _plugin_directory in the library name, because there may be more we'll add, like the theme directory.

What do you think about wordpress_org_api?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about wordpress_org_api?

My 2¢ – this is a good compromise. There are other .org APIs that we'll want to support (the theme directory, for instance)

@crazytonyli
Copy link
Contributor Author

I have a some minor concern about deserialize_default_values being somewhat easy to misuse, but since it's isolated in this new crate, I don't think we need to do anything at the moment.

It's not intended to be re-used. It's only used for parsing the unconventional response schema in the plugin directory (maybe the theme directory too, but I didn't check). I don't expect it'd be useful to the REST API code.

It'd be great if we had some unit tests that didn't require an external json file.

I didn't add any unit tests, because validating against the whole plugin directory should cover all possible code path. However, thinking about it now, I should have added some unit tests for the special serde parsing code.

I was wondering if we should mark them as #[ignore] and specifically run them in CI - probably in a step of its own. If we had a few unit tests as suggested in [5], it'd cover the local development, and CI can ensure that we are able to parse all plugins.

Sounds like a good idea. 👍

Assuming we are going to run these tests in CI using S3, we need a scheduled job to update S3 plugins from the source.

I was thinking about this. I think we would need that, but probably in a separate PR.

@jkmassel
Copy link
Contributor

jkmassel commented Dec 2, 2024

I'm happy to see that we're testing against every plugin in the directory (because we can't miss issues that way!)

I'd suggest the following though:

  1. Run the full test suite (including downloading every plugin and running the test suite against it) in CI nightly. It doesn't have to be fast – we don't want to DDOS the .org servers.
  2. Take whatever special-case plugins have weird stuff going on and copy them into the project (we can anonymize the data if needed – it's the structure we care about more)
  3. If some plugin starts failing the nightly checks, it gets downloaded and put into the project-local test suite as a json file

We can do the same thing for themes down the road.

Relatedly, I think the test in #1 can also cover our parsing of the plugin list, and any failing test cases can be added to a separate json file so we can unit test the list parsing too

@crazytonyli crazytonyli requested a review from oguzkocer December 2, 2024 23:38
@crazytonyli
Copy link
Contributor Author

Thanks for the suggestions about CI. I'll make those changes in a separate PR because I'll need to change a few places: updating the CI pipeline code here and adding a nightly job to buidkite-ci.

Copy link
Contributor

@oguzkocer oguzkocer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good on its own, but I haven't looked at the new PR yet. I left one comment and will approve once that's addressed. Thanks for the changes!

Cargo.toml Outdated
"wp_serde_helper",
"wp_uniffi_bindgen",
"wordpress_org_api",
"wordpress_org_api_integration_tests",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need a separate crate for these integration tests. We can just add the tests to tests directory in wordpress_org_api crate.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Addressed in 5dff3cf

@crazytonyli crazytonyli merged commit 94f648d into trunk Dec 4, 2024
23 checks passed
@crazytonyli crazytonyli deleted the wordpress-org-plugin-directory branch December 4, 2024 12:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants