Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: Add new files. #1006

Closed
wants to merge 7 commits into from
Closed

chore: Add new files. #1006

wants to merge 7 commits into from

Conversation

stephenh
Copy link
Owner

No description provided.

@jk-apple
Copy link
Contributor

Thanks for adding, I thought those files were auto generated.

@dasco144
Copy link
Contributor

I can try rebase and get the builds passing if you're busy @stephenh

@stephenh
Copy link
Owner Author

@dasco144 ah yeah! That would be great! I thought this would be a quick PR to update, but then the build checks didn't immediately pass, for a reason I couldn't immediately figure out, so haven't gotten around to digging into why. :-/ Thank you!

@dasco144
Copy link
Contributor

dasco144 commented Apr 16, 2024

I managed to fix the error in the lint check, but the build check I'm not so sure about.

When I run yarn build:test the output remains the same, and the contents of index.foo.ts are the following

/* eslint-disable */

export * from "./base";
export * from "./extension";

when I then run yarn bin2ts, it updates this file, and removes the second export and we're left with

/* eslint-disable */

export * from "./base";

I can push this but I'm not sure why there is a difference here, as doesn't yarn build:test run the same bin2ts command?

@stephenh
Copy link
Owner Author

@dasco144 for some reason rebuilding my protoc service image made my local machine suddenly start matching the CI output.

...I went down a rabbit hole thinking this was related to specific versions of Node, but actually the output is just non-deterministic.

If I jut run yarn bin2ts extension-import ~3-5 times in a row, I'll get different output. Sometimes the export * from ./extension line is there, sometimes it's not.

This is pretty bizarre; I don't think I've heard had ts-proto produce non-deterministic output before.

This is very annoying. :-/

@stephenh
Copy link
Owner Author

This is the code that generates the index file:

https://github.com/stephenh/ts-proto/blob/main/src/utils.ts#L47

I didn't write it ... I dunno, maybe each invocation of protoc is sending us a different order of the input files? Like sometimes base.proto comes first, and other times extension.proto comes first? Or we're using some JS data structure that is not insertion-order-stable, but afaiu most of the core JS data structures are... 🤔

If you want to poke around, I'd appreciate it, but it seems really tricky...

@dasco144
Copy link
Contributor

I'll see what I can find out 👍 Thanks for taking a look!

@dasco144
Copy link
Contributor

dasco144 commented Apr 20, 2024

I think I've found where this is happening. In codegen.sh, we get all the bin files for the tests we are watching, and that for loop triggers the execution of codegen.ts, and depending on the run, either the base.bin or extension.bin will generate its file first, which will lead to it being overridden by the second file.

You can see that output here, with some console logs as I was trying to find the cause:

image

When I limit the integration tests to only 1, instead of running up to 5 in parallel, you get the same output every time. Not ideal as we lose a ton of speed when running through the integrations.
I'm also not sure if this happens in the normal code gen, as I haven't looked into where that gets triggered from.

I don't know protos and the binaries they generate that well, so I'm not 100% why they would be different. but my guess would be because of the import in the extension.proto? How would we determine which one is the correct one to generate in this instance

@stephenh
Copy link
Owner Author

Oh shoot... @dasco144 that's a great find...

So, seems like this is a synthetic problem created by the ~naive codegen.sh approach i.e.:

  • This directory has two *.proto files
  • Currently codegen.sh invokes ts-proto once per *.proto / *.bin file in the directory
  • Both *.proto files want to write an index file
  • So of course the last one wins

Even if we could make this deterministic, my guess is that either order (a then b, or b then a) are technically wrong, and the test probably actually wants ts-proto to be invoked once with both files passed at the same time--which would result in a single index.ts getting created, and that would fundamentally stop the race condition.

🤔

I'm not sure the best way to teach codegen.sh to do this...

We have this line:

BIN_FILES=$(find $FILTER_PATHS -name "*.bin" -type f | grep -v dump-response.bin)

Maybe we could do a grep -v for the extension-import to get this one directly skipped entirely, and then just manually invoke:

"../node_modules/.bin/tsx" "./codegen.ts" "${TEST_DIR}" "...put bin1 & bin2 here, as like comma separated" "${PARAMS}" &

At the end of codegen.sh, as a special case; but then we'd also need codegen.ts to realize "the file param is actually two *.bin files, so load then separately but then merge the two request.protoFiles before calling generateIndexFiles...

That might work, but making codegen.ts even more of a "guess" about what the real protoc behavior will be; it might be worth it to just do a true protoc invocation (against the *.proto files themselves) for this one example--historically I avoided that because it was really easy to have per-machine protoc version differences, but since then the the docker-hosted protoc setup was added, so it's much less of a concern.

Honestly with the docker-hosted protoc, it would probably be a good idea to revisit the *.bin approach all together, and they may not even be necessarily anymore; which, if the testing infra was ~massively refactored to be "just invoke protoc once per integration test, with all of the *.proto files passed as a single arg", that would be the nuclear way of fixing this race condition. :-)

Dunno...it'd be amazing if you wanted to try either of these approaches (a small incremental hack to codegen.ts to accept multiple *.bin files, or a complete rework that invokes protoc directly and removes codegen.ts + *.bin files all together), but definitely np; all the investigation you've done so far is great, and really appreciated!

@stephenh
Copy link
Owner Author

stephenh commented Jun 5, 2024

Well, took awhile to get to, but I finally re-did this PR in #1050, but following up on the idea of "maybe the *.bin approach has outlived its usefulness".

Now we just directly invoke protoc from *.proto to *.ts and that fixes the non-determinism. Whew! 😅

@stephenh stephenh closed this Jun 5, 2024
@stephenh stephenh deleted the add-missing-files branch June 5, 2024 02:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants