Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Read Webpack stats file as stream to support very large projects #624

Merged
merged 4 commits into from Aug 10, 2022

Conversation

ethriel3695
Copy link
Contributor

This builds off of the changes to the trim-stats-file.
We are now moving to change all instances of the stats file to use readStatsFile which will read stats files from a stream.
This will allow handling larger stats files and avoid hitting a file size limit.

@linear
Copy link

linear bot commented Aug 3, 2022

AP-2270 Add trim-stats-file changes to the preview-stats process

Now that we've tested that the stream read works correctly with trim-stats-file we can make changes to the preview-stats read process and get this working correctly.

export const readStatsFile = async (filePath: string): Promise<Stats> =>
parseChunked(createReadStream(filePath));
export const readStatsFile = async (filePath: string): Promise<Stats> => {
if (existsSync(filePath)) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this check because I found that in the upload.test.ts test for traced files, it was erroring out because the file path does not return an existing file.

I attempted to mock the file path but because we mock fs-extra it overrides all my mock tests and cannot find the file.

The logic check ensures that we don't error out on a stats file with a directory that we cannot process.

However, once the trace or upload reaches the point where it's processing the stats file, we have already checked for a file path and that the file exists so we should be able to reasonably assume that the file exists and this is probably an indicator that the test needs to be refactored.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is strange is that when you include a combination of the sourceDir and statsPath the upload process builds a path that would most likely not exist in production.

sourceDir: '/static/',
fileInfo: { statsPath: '/static/preview-stats.json' },

This is the path: static/static/preview-stats.json
This is the existing test so I'm hesitant to change it but that file path does not seem like something that would ever exist in a real project.

So is the test incorrect, is the path join incorrect, or is this expected?
@ghengeveld @tmeasday

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ethriel3695 reading where statsPath comes from, it is relative to the sourceDir:

// Get all paths in rootDir, starting at dirname.
// We don't want the paths to include rootDir -- so if rootDir = storybook-static,
// paths will be like iframe.html rather than storybook-static/iframe.html
function getPathsInDir(ctx: Context, rootDir: string, dirname = '.'): PathSpec[] {
try {
return fs.readdirSync(join(rootDir, dirname)).flatMap((p: string) => {
const pathname = join(dirname, p);
const stats = fs.statSync(join(rootDir, pathname));
return stats.isDirectory()
? getPathsInDir(ctx, rootDir, pathname)
: [{ pathname, contentLength: stats.size }];
});
} catch (e) {
ctx.log.debug(e);
throw new Error(invalid({ sourceDir: rootDir } as any, e).output);
}
}
function getOutputDir(buildLog: string) {
const outputString = 'Output directory: ';
const outputIndex = buildLog.lastIndexOf(outputString);
if (outputIndex === -1) return undefined;
const remainingLog = buildLog.slice(outputIndex + outputString.length);
const newlineIndex = remainingLog.indexOf('\n');
const outputDir = newlineIndex === -1 ? remainingLog : remainingLog.slice(0, newlineIndex);
return outputDir.trim();
}
function getFileInfo(ctx: Context, sourceDir: string) {
const lengths = getPathsInDir(ctx, sourceDir).map((o) => ({ ...o, knownAs: slash(o.pathname) }));
const total = lengths.map(({ contentLength }) => contentLength).reduce((a, b) => a + b, 0);
const paths: string[] = [];
let statsPath: string;
// eslint-disable-next-line no-restricted-syntax
for (const { knownAs } of lengths) {
if (knownAs.endsWith('preview-stats.json')) statsPath = knownAs;
else if (!knownAs.endsWith('manager-stats.json')) paths.push(knownAs);
}
return { lengths, paths, statsPath, total };
}

So I think that simply the test is wrong here.

Copy link
Contributor

@yannbf yannbf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This all looks good to me, thanks for the good work @ethriel3695 !

Copy link
Member

@thafryer thafryer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like!! Good work @ethriel3695 !!

Copy link
Member

@tmeasday tmeasday left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple comments!

bin-src/tasks/upload.ts Outdated Show resolved Hide resolved
@@ -90,6 +91,8 @@ describe('validateFiles', () => {
});

describe('traceChangedFiles', () => {
jest.spyOn(trimStatsFile, 'readStatsFile').mockReturnValueOnce(Promise.resolve(undefined));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How was this file previously mocking the reading of JSON?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wasn't. Just resolved to undefined and did not break. The tests could be improved if there are use cases that require stats files to contain something though

Copy link
Member

@tmeasday tmeasday left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙌

@ethriel3695 ethriel3695 merged commit 3e379a2 into main Aug 10, 2022
@ethriel3695 ethriel3695 deleted the reuben/ap-2270-add-trim-stats-file-changes-to-the branch August 10, 2022 02:13
@ghengeveld ghengeveld changed the title Update stats process to use stream Read Webpack stats file as stream to support very large projects Aug 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants