Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: support subspace in package-extractor #4655

Merged

Conversation

EscapeB
Copy link
Contributor

@EscapeB EscapeB commented Apr 17, 2024

Summary

Fix issue that can't extract projects dependencies correctly if there exist multiple subspaces.

Details

Now only subspace that for entry project will be passed to package extractor, but if there are multiple projects belongs to different subspaces, dependencies in other subspaces can't be extract correctly.

How it was tested

Tested with in-house production project.

Also there is an example that shows what the file structure will be like after this PR merged.
Here is the example subspace repo https://github.com/william2958/rush-subspace-demo
image

Impacted documentation

The internal API for PackageExtractor has been changed to support multiple subspaces

const queue: RushConfigurationProject[] = [project];

while (queue.length > 0) {
const _project: RushConfigurationProject = queue.shift()!;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did we forget to fill 'projects' in this function?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, deleted by mistake...

@@ -483,8 +499,13 @@ export class PackageExtractor {
path.join(packageJsonRealFolderPath, 'package.json')
);

const targetSubspace: IExtractorSubspace | undefined = subspaces?.find(
(p) => p.pnpmInstallFolder && Path.isUnder(packageJsonFolderPath, p.pnpmInstallFolder)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious: would it be possible to have multiple "subspaces" here?

It's related if there are any nested projects in a monorepo (which is not recommended)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so, because each subspace's pnpmInstallFolder will be located at common/temp and they won't be nested right?

We can't have multiple subspaces in following logic, or it will have some issues

One dependency from one subspace might be resolved to another subspace, which is not expected

try {
// The PNPM virtual store links are created in this folder. We will resolve the current package
// from that location and collect any additional links encountered along the way.
// TODO: This can be configured via NPMRC. We should support that.
const pnpmDotFolderPath: string = path.join(pnpmInstallFolder, 'node_modules', '.pnpm');
const pnpmDotFolderPath: string = path.join(
targetSubspace.pnpmInstallFolder!,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we remove the "!" sign here? Instead, adding it to if-condition

if (targetSubspace && targetSubspace.pnpmInstallFolder) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

*/
pnpmInstallFolder?: string;
subspaces?: IExtractorSubspace[];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NOTE: It's a breaking change to remove "pnpmInstallFolder" and entirely replace it with "subspaces". If we do so, it's a major version bump.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, to avoid breaking API compatibility we could allow both alternatives: Either you specify IExtractorState.transformPackageJson etc, or else you can specify IExtractorState.subspaces (but not both at the same time).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I will rethink how can we make API more compatible here.

const queue: RushConfigurationProject[] = [project];

while (queue.length > 0) {
const _project: RushConfigurationProject = queue.shift()!;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

const _project: RushConfigurationProject = queue.shift()!;

for (const dependency of _project.dependencyProjects) {
queue.push(dependency);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder visited projects should be ignored.

To illustrate:

const visited: Set<RushConfigurationProject> = new Set();

...

for (const dependency of _project.dependencyProjects) {
  if (visited.has(dependency) {
    continue;
  }
  
  ...

  visited.add(dependency);
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it would have some performance improvement.

if (!scenarioConfiguration.json.omitPnpmWorkaroundLinks) {
subspace.pnpmInstallFolder = project.subspace.getSubspaceTempFolder();
}
subspaces.push(subspace);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming A, B lives in the same subspace. It feels like the current logic generates the same subspace twice. Am I understanding this correctly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch, I might change it to a Set to avoid that case.

@octogonz
Copy link
Collaborator

octogonz commented Apr 17, 2024

How it was tested

Tested with in-house production project.

We don't need an official build test necessarily, but I'd really like to at least see a branch that illustrates how this algorithm is used. If you can't easily reproduce the error being fixed, that's okay, but it would be very helpful to have an example rush install folder that we can run the new algorithm against to see what the deployed output looks like in the context of subspaces.

@william2958 @EscapeB


while (queue.length > 0) {
const _project: RushConfigurationProject = queue.shift()!;

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shift() is an O(n) operation making this loop O(n^2). If N=1,000 then N^2=1,000,000 which might be significant

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch, I think I can simple convert it to be a DFS by replacing shift with pop.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good

@EscapeB
Copy link
Contributor Author

EscapeB commented Apr 18, 2024

How it was tested

Tested with in-house production project.

We don't need an official build test necessarily, but I'd really like to at least see a branch that illustrates how this algorithm is used. If you can't easily reproduce the error being fixed, that's okay, but it would be very helpful to have an example rush install folder that we can run the new algorithm against to see what the deployed output looks like in the context of subspaces.

@william2958 @EscapeB

@octogonz I have added the example rush repo and a screenshot of deployment output.

@EscapeB EscapeB force-pushed the fix/support_subspace_in_package_extractor branch from 40edee8 to 7228c25 Compare April 18, 2024 04:37
@octogonz
Copy link
Collaborator

octogonz commented Apr 18, 2024

How it was tested

Tested with in-house production project.

We don't need an official build test necessarily, but I'd really like to at least see a branch that illustrates how this algorithm is used. If you can't easily reproduce the error being fixed, that's okay, but it would be very helpful to have an example rush install folder that we can run the new algorithm against to see what the deployed output looks like in the context of subspaces.
@william2958 @EscapeB

@octogonz I have added the example rush repo and a screenshot of deployment output.

William also created a test branch for us here:
https://github.com/william2958/rush-subspace-demo

I had a long day today at the conference, so I will test it first thing tomorrow morning. Mainly I want to inspect the symlinks created by rush deploy to confirm that they look correct and aren't including extraneous files. (It's hard to judge from only looking at the code.)

@EscapeB
Copy link
Contributor Author

EscapeB commented Apr 18, 2024

How it was tested

Tested with in-house production project.

We don't need an official build test necessarily, but I'd really like to at least see a branch that illustrates how this algorithm is used. If you can't easily reproduce the error being fixed, that's okay, but it would be very helpful to have an example rush install folder that we can run the new algorithm against to see what the deployed output looks like in the context of subspaces.
@william2958 @EscapeB

@octogonz I have added the example rush repo and a screenshot of deployment output.

William also created a test branch for us here: https://github.com/william2958/rush-subspace-demo

I had a long day today at the conference, so I will test it first thing tomorrow morning. Mainly I want to inspect the symlinks created by rush deploy to confirm that they look correct and aren't including extraneous files. (It's hard to judge from only looking at the code.)

Yep, I'm also using the repo William created, I put its link in this PR description.

@@ -483,8 +499,13 @@ export class PackageExtractor {
path.join(packageJsonRealFolderPath, 'package.json')
);

const targetSubspace: IExtractorSubspace | undefined = subspaces?.find(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This find() is calling Path.isUnder() for every subspace (including paths that don't match), for every node_modules package that gets crawled. If we have n=1000 subspaces and n=1000 installed folders, that is O(n*m)=O(1,000,000) operations.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a couple examples of how Rush handles these problems:

@EscapeB @william2958 Because PR #4655 is an important fix, we don't need to address the problem in this PR. But I think we should consider optimizing these lookups because O(n^2) loops can and do cause noticeable slowness.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK,will try to improve and that in later PR.

// Transform packageJson using the provided transformer, if requested
const packageJson: IPackageJson = transformPackageJson?.(originalPackageJson) ?? originalPackageJson;
const packageJson: IPackageJson =
targetSubspace?.transformPackageJson?.(originalPackageJson) ?? originalPackageJson;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This API design is confusing for users because it ignores mistakes. In the below example, what is the pnpmInstallFolder for the default subspace? It is unclear.

this.extractAsync({
  pnpmInstallFolder: 'abc',
  subspaces: [
    { subspaceName: 'default', pnpmInstallFolder: 'def' },
    { subspaceName:'other', pnpmInstallFolder: 'ghi' }
  ]
})

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I'm working on a fix)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@@ -36,6 +36,7 @@ export interface IExtractorOptions {
pnpmInstallFolder?: string;
projectConfigurations: IExtractorProjectConfiguration[];
sourceRootFolder: string;
subspaces?: IExtractorSubspace[];
targetRootFolder: string;
terminal: ITerminal;
transformPackageJson?: (packageJson: IPackageJson) => IPackageJson | undefined;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
transformPackageJson?: (packageJson: IPackageJson) => IPackageJson | undefined;
transformPackageJson?: (packageJson: IPackageJson) => IPackageJson;

I think this was just typed incorrectly. I don't see any usage of this transform that could reasonably need the method to return "undefined".

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I checked and PNPM itself seems to reject that case:

if (pnpmfile?.hooks?.readPackage) {
  const readPackage = pnpmfile.hooks.readPackage;
  pnpmfile.hooks.readPackage = async function(pkg, ...args2) {
    pkg.dependencies = pkg.dependencies ?? {};
    pkg.devDependencies = pkg.devDependencies ?? {};
    pkg.optionalDependencies = pkg.optionalDependencies ?? {};
    pkg.peerDependencies = pkg.peerDependencies ?? {};
    const newPkg = await readPackage(pkg, ...args2);
    if (!newPkg) {
      throw new BadReadPackageHookError(pnpmFilePath, "readPackage hook did not return a package manifest object.");
    }

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Comment on lines 238 to 239
* Can't be used with subspaces, if subspaces field is specified, we will use pnpmInstallFolder configured
* for each subspace and ignore this.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update this to throw if both are provided instead.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in my EscapeB#1

Comment on lines 524 to 526
targetSubspace?.transformPackageJson?.(originalPackageJson) ??
transformPackageJson?.(originalPackageJson) ??
originalPackageJson;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should validate that only one of the two options (subspaces vs top-level transformPackageJson/pnpmInstallFolder) is provided, and throw an error if it's not the case. We wouldn't want to hide why the top-level ones are ignored when someone specified subspaces

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in my EscapeB#1

@octogonz octogonz merged commit 25b89ec into microsoft:main Apr 18, 2024
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

None yet

5 participants