-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(create-expo): avoid corrupting .tar.gz files in templates #26741
fix(create-expo): avoid corrupting .tar.gz files in templates #26741
Conversation
tar.extract(), by default, parses files from within a .tar.gz as UTF-8. This is generally incorrect for binary files. I found, when creating an Expo template involving a .tar.gz file (or equally a .zip file) inside, that create-expo would successfully unpack most of the template, but corrupt the inner .tar.gz file. I've applied this minimal fix to create-expo, and also to @expo/cli (as it seems the createFileTransform() implementation is mirrored there).
4df58f9
to
96a3ac7
Compare
Follow-up: This certainly fixes one problem, but it caught me again in another case (this time with a file other than Really starting to think the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, thanks @shirakaba ❤️ Let's keep chatting about a possible better fix to account for this scenario.
# Why `tar.extract()` (from [node-tar](https://github.com/isaacs/node-tar), used by `create-expo` [here](https://github.com/expo/expo/blob/8d29bbba0ccd4ecb35bac0df7b512ee129d3cde8/packages/create-expo/src/Examples.ts#L87)), by default, parses files from within a `.tar.gz` as UTF-8. This is generally incorrect for binary files inside the tarball. I found, when creating an Expo template containing a `.tar.gz` file (or equally a `.zip` file) inside, that `create-expo` would successfully unpack most of the template, but corrupt the inner `.tar.gz` file. It affects both cloning from GitHub and from a local tarball. I've provisionally applied this minimal fix to `create-expo`, and also to `@expo/cli` (as it seems the `createFileTransform()` implementation is mirrored there). We could of course cover more types like `.zip`, but as the scope is basically "all possible binary files", I think the list is endless. Importance: I can give Expo superpowers if this PR lands 👀 # How ```sh # First, duplicate expo-template-bare-minimum. # Installing it from npm is a convenient way to do so. mkdir -vp workdir/expo-template-custom cd workdir npm install expo-template-bare-minimum mv node_modules/expo-template-bare-minimum expo-template-custom # Now, add any .tar.gz file to it. # Let's say it's called example.tar.gz and is 70 MB in size. mv path/to/example.tar.gz expo-template-custom cd expo-template-custom npm pack cd .. # Next, install the CLI locally. npm install create-expo-app@latest # Now make the patch to node_modules/create-expo-app/build/index.js. # Create a new project using your local Expo template, via the forked CLI ./node_modules/.bin/create-expo-app --template ./expo-template-custom/expo-template-custom-50.0.33.tgz verificationapp # Look inside the created app and check whether the size of example.tar.gz is still 70 MB. # Without this patch, it was ballooning to 140 MB, so the difference is easy to spot. ls -la verificationapp ``` # Test Plan Local test plan covered in the "How" section above. I think there is no risk to this change as none of the templates include `.tar.gz` files currently, and all it does is fix something that's broken to begin with. # Checklist I think I'll check the same choices as Cedric did in this [similar PR](#26639). - [x] Documentation is up to date to reflect these changes (eg: https://docs.expo.dev and README.md). - [ ] Conforms with the [Documentation Writing Style Guide](https://github.com/expo/expo/blob/main/guides/Expo%20Documentation%20Writing%20Style%20Guide.md) - [ ] This diff will work correctly for `npx expo prebuild` & EAS Build (eg: updated a module plugin). --------- Co-authored-by: Jamie Birch <jamie@MacBook-Pro.local> Co-authored-by: Cedric van Putten <github@cedric.dev>
…plates (#27212) # Why Continuing on from #26741 where I merely kicked the can down the road, here I implement a proper lasting solution. As explained in the previous PR, both `create-expo` and `@expo/cli` corrupt certain files when extracting templates due to how they rename "HelloWorld" to the user's specified app name during extraction. Specifically, this seems to cause binary files (e.g. executables) to double in size, presumably due to an accidental encoding change. While select file types are spared from this process [based on their file extension](https://github.com/expo/expo/blob/ce141b51fb8aaaa8e11038dd997ab6a3f22f2429/packages/create-expo/src/createFileTransform.ts#L60-L73), it's not sustainable to guess every possible file extension that should be exempt (we're missing `.zip`, for one thing), and overlooks files that have no extension at all (like binaries). So I thought that maybe we've been doing it the wrong way round. Instead of renaming *all but some* files, we should **explicitly select which ones get renamed**. This would be configurable, of course – we'd have sensible defaults that would avoid most devs ever having to think about the process, but we'd also support templates including their own config to handle any exotic cases. Importance: this allows us to safely commit to templates the following files: * prebuilt binaries like `.xcframework`s * arbitrary compressed files like `.zip`, for reducing download times * more media types (all kinds of videos) * files that you actually want to write "HelloWorld" into, for some reason … It also finally makes the renaming procedure atomic and easily testable (hence the many new tests). # How <!-- How did you build this feature or fix this bug and why? --> The feature is implemented in both `@expo/cli` and `create-expo`: * I removed the file contents rewriting process from the tar extractor (see changes to `createFileTransform.ts`). * I kept the file path rewriting process of the tar extractor, however, as that seemed to do more good than harm (but yes, maybe deserves a followup). * I introduced two post-extraction steps: 1. `getTemplateFilesToRenameAsync()`, which accepts a "rename config", or otherwise falls back to the CLI's default (which hopefully covers most cases in existence anyway) to determine which files should be renamed 2. `renameTemplateAppNameAsync()`, which, for a given list of files, renames “HelloWorld” to the desired app name with appropriate sanitsation * I added unit tests and E2E tests for each of these functions * I worked out a default rename config (search for `defaultRenameConfig`) that seems to cover all template cases in the monorepo, though I may have possibly overlooked some cases in templates outside the monorepo. ## About the rename config See the documentation in the updated `packages/create-expo/README.md`. # Test Plan <!-- Please describe how you tested this change and how a reviewer could reproduce your test, especially if this PR does not include automated tests! If possible, please also provide terminal output and/or screenshots demonstrating your test/reproduction. --> I've added lots of automated tests to give proper confidence, but manual testing can be performed by building Create Expo and using them: * Check out the branch of interest (either `main` or this PR, `@shirakaba/extract-binaries-safely`) * Build Create Expo and create a template with it ```sh # cd to wherever you keep your Expo monorepo cd ~/expo # Build create-expo cd packages/create-expo yarn build # cd to some other work directory. We'll build a template # in there. cd ~/Downloads ~/expo/packages/create-expo/build/index.js # Make an app called "ByeWorld", then move it into a # `new` or `old` directory for easy comparison later ``` * Having generated a template using a build of Create Expo derived from both `main` and this PR's branch, you can finally compare the created templates to assess that there have been no regressions. If this command produces no output, then there's no difference (and thus no regression): ```sh diff -r --exclude=.git new/ByeWorld old/ByeWorld ``` ![image](https://github.com/expo/expo/assets/14055146/4271aeb0-1b4d-4d7e-92e4-927bca060af4) # Checklist - [x] Documentation is up to date to reflect these changes (eg: https://docs.expo.dev and README.md). - [ ] Conforms with the [Documentation Writing Style Guide](https://github.com/expo/expo/blob/main/guides/Expo%20Documentation%20Writing%20Style%20Guide.md) - [x] This diff will work correctly for `npx expo prebuild` & EAS Build (eg: updated a module plugin). # Related #27071 --------- Co-authored-by: Jamie Birch <jamie@MacBook-Pro.local> Co-authored-by: Cedric van Putten <me@bycedric.com>
Why
tar.extract()
(from node-tar, used bycreate-expo
here), by default, parses files from within a.tar.gz
as UTF-8. This is generally incorrect for binary files inside the tarball. I found, when creating an Expo template containing a.tar.gz
file (or equally a.zip
file) inside, thatcreate-expo
would successfully unpack most of the template, but corrupt the inner.tar.gz
file. It affects both cloning from GitHub and from a local tarball.I've provisionally applied this minimal fix to
create-expo
, and also to@expo/cli
(as it seems thecreateFileTransform()
implementation is mirrored there). We could of course cover more types like.zip
, but as the scope is basically "all possible binary files", I think the list is endless.Importance: I can give Expo superpowers if this PR lands 👀
How
Test Plan
Local test plan covered in the "How" section above. I think there is no risk to this change as none of the templates include
.tar.gz
files currently, and all it does is fix something that's broken to begin with.Checklist
I think I'll check the same choices as Cedric did in this similar PR.
npx expo prebuild
& EAS Build (eg: updated a module plugin).