Question: multiworkspace monorepo #6483

Open
SebastianBogado opened this issue Oct 3, 2018 · 11 comments

@SebastianBogado

SebastianBogado commented Oct 3, 2018

Before I continue, this is a question. If there's a better place to post this, please let me know :)

So the docs state:

Workspaces must be descendants of the workspace root in terms of folder hierarchy. You cannot and must not reference a workspace that is located outside of this filesystem hierarchy.

Fine, let's get to an example. For this hierarchy:

.
├── applications
│   └── awesome-app
│       ├── comp-service1
│       └── package.json  // here goes the workspace config
│   └── another-awesome-app
│       ├── comp-service2
│       └── package.json  // here goes the workspace config
├── common
│   └── foo
├── lerna.json
└── package.json

with config like:

  "workspaces": [
    "comp-service1",
    "../../common/*"
  ] 

And comp-service1 depending on foo.

I know that running $ node comp-service1/index from inside the awesome-app workspace won't work properly: foo's dependencies were installed under applications/awesome-app/node_modules, so foo won't be able to find them.

BUT when running with $ NODE_PATH=$(pwd)/node_modules node comp-service1/index, Node's resolution mechanism works perfectly.
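
To illustrate why NODE_PATH makes a difference here, below is a rough sketch of the directories Node searches for a bare import. It assumes the repository lives at /repo (a made-up path) and that foo is resolved from its real location under common/foo, which is Node's default behavior for symlinked packages. lookupDirs is a hypothetical helper, not a Node API, and it ignores Node's global folders.

```javascript
const path = require("path").posix;

// Approximates where Node looks for a bare import made from `fromDir`:
// each ancestor's node_modules (nearest first), then the NODE_PATH entries.
// NODE_PATH is split on ":" here (POSIX); Windows uses ";" instead.
function lookupDirs(fromDir, nodePath = "") {
  const dirs = [];
  let current = path.resolve(fromDir);
  while (true) {
    // Node does not append node_modules to a directory already named so.
    if (path.basename(current) !== "node_modules") {
      dirs.push(path.join(current, "node_modules"));
    }
    const parent = path.dirname(current);
    if (parent === current) break; // reached the filesystem root
    current = parent;
  }
  return dirs.concat(nodePath.split(":").filter(Boolean));
}

// Without NODE_PATH, nothing under applications/ is ever searched.
console.log(lookupDirs("/repo/common/foo"));

// With NODE_PATH, the workspace's node_modules becomes a last-resort location.
console.log(
  lookupDirs("/repo/common/foo", "/repo/applications/awesome-app/node_modules")
);
```

The first call lists only /repo/common/foo/node_modules and its ancestors, which is why foo fails to find dependencies hoisted into the app's workspace. The second call adds the workspace's node_modules as a fallback.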

All in all, my question is: is there any other reason why we shouldn't have workspaces outside of the hierarchy? Am I missing something?

Thanks!

@ghost ghost assigned arcanis Oct 3, 2018
@ghost ghost added the triaged label Oct 3, 2018
@arcanis
Member

arcanis commented Oct 3, 2018

Mostly because anything within common won't be aware that it belongs to another workspace (or worse, multiple ones). Meaning that when you run yarn there (for example yarn add, or similar) it won't be able to detect your top-level package.json, or yarn.lock, or .pnp.js.

You should instead create a package.json in . that would have the following workspaces:

{
  "workspaces": [
    "applications/*",
    "common/*"
  ]
}

@SebastianBogado
Author

Thanks for the quick answer!

I forgot to mention that each app has many packages in it, let's say a server and a frontend app.
And we've had bad experiences with the workspace defined at the root :/, so now we're looking for an intermediate solution between root node_modules hoisting and not having hoisting at all, while keeping the linking between packages in the monorepo.
So we're looking for something between yarn workspaces and lerna.

Yes, common could belong to many workspaces, which helps with the linking and half-way hoisting part. We could even talk about having a workspace at common itself, to use yarn commands and speed up installs.

I know that this means installing the same dependencies more than once across all workspaces that include common packages, but still, I can live with that. It's better than not having hoisting at all :)

Any other thing that may break?

@arcanis
Member

arcanis commented Oct 4, 2018

Any other thing that may break?

Who knows. It's an undefined behavior we don't officially support, so in theory it could behave differently in later versions. We could start enforcing the limitation, for example 🙂

In practice it should be okayish, but still not great :( Would you mind sharing the issues you got with a workspace defined at the root?

@SebastianBogado
Author

SebastianBogado commented Oct 4, 2018

We could start enforcing the limitation

Please don't lol

Okay, so, grab a coffee ☕️ 😛

The issue with workspace defined at the root comes when your packages are using different versions of a dependency. In our monorepo we have four apps, each with their packages, and some common packages. In total, we have around 50 publishable modules.

Simplified example
To simplify the issue, say you have just three apps. Two of them depend on react@15 and the other on react@16. All depend on react-redux@5, which works with every version of React. So yarn will hoist react@15 and react-redux@5 and you get:

.
├── applications
│   ├── my-first-app
│   │   └── frontend
│   ├── my-second-app
│   │   └── frontend
│   └── my-third-app
│       └── frontend
│           └── node_modules
│               └── react@16.4.1
└── node_modules
    ├── react@15.6.2
    └── react-redux@5.0.7

First issue
Here's the first issue we found: when my-third-app imports react-redux, it will load the one at the root. And when react-redux imports react, it will get its sibling, react@15. But the app already loaded react@16. So you end up loading two different versions of React :/

Back to the real-life situation, we found this to be problematic with react, react-dom and styled-components. The app wasn't working properly: while debugging we found that some pieces of code ran on react@16 and others ran on react@15. Also, the styles weren't being applied (styled-components explicitly says that it won't work when you have more than one version loaded). And last but not least, our snapshot tests, with jest-styled-components, were broken too for the same reason.
And it wasn't only react-redux that would work with more than one react version. We found this with other dependencies, some of them transitive.
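
The lookup behind this first issue can be traced with a small simulation. This is only a sketch: it assumes the repository sits at /repo (a made-up path), models the simplified tree above as an in-memory map, and uses a hypothetical resolveFrom helper that walks up parent directories the way Node's node_modules lookup does.

```javascript
const path = require("path").posix;

// Installed packages keyed by directory, mirroring the simplified tree above.
const installed = {
  "/repo/node_modules/react": "15.6.2",
  "/repo/node_modules/react-redux": "5.0.7",
  "/repo/applications/my-third-app/frontend/node_modules/react": "16.4.1",
};

// Walks up from `fromDir` and returns the first node_modules/<name> that exists.
function resolveFrom(fromDir, name) {
  let current = fromDir;
  while (true) {
    const candidate = path.join(current, "node_modules", name);
    if (candidate in installed) return candidate;
    const parent = path.dirname(current);
    if (parent === current) return null; // reached the root, not found
    current = parent;
  }
}

const app = "/repo/applications/my-third-app/frontend";
const appReact = resolveFrom(app, "react");
const reactRedux = resolveFrom(app, "react-redux");
// react-redux lives at the root, so its own lookup for react starts there.
const reactReduxReact = resolveFrom(reactRedux, "react");

console.log(installed[appReact]);        // 16.4.1
console.log(installed[reactReduxReact]); // 15.6.2
```

The app and the hoisted react-redux end up with different copies of react, which matches the mixed react@15 and react@16 behavior described here.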

To work around this we had to specify nohoist with carefully picked libraries on each package. Some of those were transitive dependencies, so we had to add them as direct dependencies for the nohoist to work.

Second issue
And then we ran into the next issue: yarn was installing a copy of the hoisted version of a package inside the one being nohoisted:

.
├── applications
│   ├── my-first-app
│   │   └── frontend
│   ├── my-second-app
│   │   └── frontend
│   └── my-third-app
│       └── frontend
│           └── node_modules
│               ├── react@16.4.1
│               └── react-redux@5.0.7
│                   └── node_modules
│                       └── react@15.6.2
└── node_modules
    ├── react@15.6.2
    └── react-redux@5.0.7

Details around this and repo to reproduce can be found here #5978 . It has nothing to do with lerna as it was also reported here #6206 .

To work around this, we added postinstall scripts that fixed the dependencies. For example, rm -rf node_modules/react-redux/node_modules/react.
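
For readers who prefer a Node script over a shell one-liner, the same cleanup could look roughly like the sketch below. The actual postinstall scripts are not shown in this thread, so removeNestedDuplicates and its pairs argument are hypothetical. It relies on fs.rmSync, which needs Node 14.14 or later.

```javascript
const fs = require("fs");
const path = require("path");

// Deletes nested copies such as node_modules/react-redux/node_modules/react.
// `pairs` is a hypothetical list of [parentPackage, nestedDuplicate] names.
// Returns the directories that were actually removed.
function removeNestedDuplicates(packageDir, pairs) {
  const removed = [];
  for (const [parent, nested] of pairs) {
    const target = path.join(
      packageDir, "node_modules", parent, "node_modules", nested
    );
    if (fs.existsSync(target)) {
      fs.rmSync(target, { recursive: true, force: true });
      removed.push(target);
    }
  }
  return removed;
}

// Example call from a postinstall script run inside the package directory:
// removeNestedDuplicates(process.cwd(), [["react-redux", "react"]]);
```

Keeping the pairs in one list at least makes it visible which duplicates are being patched out, though it does not make the approach any less fragile.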

Third issue
The third issue is the state of our monorepo. Our solution is fragile.

You don't know what's going to happen whenever you add a dependency or an entire new module. This may affect the voting mechanism and change the hoisted versions of any library, not only the one you just added (because of transitive dependencies).

If more apps moved to react@16 and it ended up hoisted at the root, the situation would flip around. We would need to remove the nohoist entries and postinstall scripts from the apps that have issues today, and add them to the ones that currently work properly.

Finally, in a monorepo with thousands of dependencies (including transitives, of course), these dependency issues are really hard to debug, and tweaking configs and doing clean installs to fix them is time-consuming.

This is not ideal and that's why we're looking into a better solution. We don't want to go back to lerna because the bootstrap step was around 30 minutes (no kidding!). We love the speed-up yarn provides (down to 5 minutes), and thanks for that! But it turned out to be a double-edged sword for our use case and now we're trying to work out something in between.
We also considered splitting the monorepo into same-stack apps but we decided against it.

I'm open to other ideas :)

Thanks for reading!

@arcanis
Member

arcanis commented Oct 4, 2018

Thanks for your post, and sorry for the bad experience! I think I have a solution that you might be interested in - check the end of my post for more info.

First issue
Here's the first issue we found: when my-third-app imports react-redux, it will load the one at the root. And when react-redux imports react, it will get its sibling, react@15. But the app already loaded react@16. So you end up loading two different versions of React :/

That's interesting. It's definitely a bug. react-redux should, by contract, be guaranteed to obtain the exact same version of react as its parent. The right behavior would be to duplicate react-redux once for each workspace.

Second issue
And then we ran into the next issue: yarn was installing a copy of the hoisted version of a package inside the one being nohoisted:

This is clearly a bug as well. The nohoist feature is trickier to get right because Yarn wasn't designed to make the hoisting mechanism pluggable, so it was implemented on top after the fact, and there are some edge cases that don't work well with it 🙁


Ok, so the first good thing is that the core issue you had was caused by bugs. The first one in particular looks annoying; we would very much welcome a patch to prevent react-redux from being hoisted if it contains a peer dependency!

The second good thing is related to your third issue:

You don't know what's going to happen whenever you add a dependency or an entire new module. This may affect the voting mechanism and change the hoisted versions of any library, not only the one you just added (because of transitive dependencies).

I don't know if you've seen it, but starting from 1.12, Yarn ships with a new installation strategy called Plug'n'Play. In this mode Yarn doesn't create the node_modules folder anymore, and directly uses the Yarn cache to resolve the packages. Among other things, it also entirely removes the need for any hoisting. Thanks to this, packages with peer dependencies are guaranteed to get the right result, whatever happens. There can still be bugs, of course, but there's a much smaller surface for such problems to arise.

Would you be interested in giving it a try? It might depend on your package list (some packages like Angular and TypeScript are known to have some compatibility problems at the moment), but overall it's stable enough that our teams have been using it for more than a month now.

@SebastianBogado
Author

Oh, so both were bugs. I thought the first one was expected behavior. Should have asked :P

Regarding Plug'n'Play, I've heard something similar from npm. And also other package managers (besides JS) go in that direction.

Thanks for the heads-up, I'll give it a try!

@SokratisVidros

@SebastianBogado do you have any updates regarding yarn pnp? Did it solve any of the aforementioned problems? We have a similar setup and we are facing the same issue with dependency hoisting.

@SebastianBogado
Author

Hi @SokratisVidros. We went with a custom solution in the end, one that simulates there being only one app in the monorepo. So yarn only sees that app's submodules (webapp, integration tests, composition service, etc.) + the common packages that it needs.

Why wouldn't yarn pnp work for us? It would still be installing all the dependencies (hundreds of thousands), which is far too time-consuming. Maybe in local development it wouldn't be much of an issue thanks to the cache, but in our pipelines we run installs from scratch and it was not acceptable.

I suggest you try something similar; it's not complex and now our monorepo scales as expected. We have 20 apps and counting (each with their own submodules); we have 50-ish common packages, shared between apps; and five teams of 5 engineers on average contributing to the monorepo daily. And it works 🌈

@SokratisVidros

Thanks for the help @SebastianBogado. Could you please elaborate on the single-app approach? Does that mean yarn is local to every app in the monorepo, resulting in a comprehensive package.json that includes all dependencies?

@SebastianBogado
Author

@SokratisVidros basically, we run a JS script that changes the workspaces definition of the root package.json (with proper git commands so that we don't commit those changes). So for instance, this would be the standard workspaces definition:

  "workspaces": {
    "packages": [
      "common/**",
      "applications/**"
    ]
  }

And after running the script:

  "workspaces": {
    "packages": [
      "common/telemetry-middleware",
      "common/ui-components",
      "common/react-session-handler",
      // ...
      "applications/my-awesome-app/webapp",
      "applications/my-awesome-app/comp-service",
      "applications/my-awesome-app/integration-tests",
      "applications/my-awesome-app"
    ]
  },

We also have many lockfiles, one for each app, and the script creates a symlink to the current app lockfile.
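
The script itself isn't shown in this thread, so the following is only a guess at what the workspace-narrowing step could look like. narrowWorkspaces, its arguments, and the "monorepo" package name are all hypothetical. The sketch leaves out writing package.json back to disk, the git handling, and the lockfile symlink.

```javascript
// Returns a copy of the root package.json contents in which the workspace
// globs are replaced by an explicit list for a single app.
// Every name used here is illustrative, not taken from the real script.
function narrowWorkspaces(rootPkg, appDir, appSubmodules, commonPackages) {
  const packages = [
    ...commonPackages.map((name) => `common/${name}`),
    ...appSubmodules.map((name) => `${appDir}/${name}`),
    appDir,
  ];
  // Copy instead of mutating, so the original definition stays available.
  return { ...rootPkg, workspaces: { ...rootPkg.workspaces, packages } };
}

const rootPkg = {
  name: "monorepo",
  private: true,
  workspaces: { packages: ["common/**", "applications/**"] },
};

const narrowed = narrowWorkspaces(
  rootPkg,
  "applications/my-awesome-app",
  ["webapp", "comp-service", "integration-tests"],
  ["telemetry-middleware", "ui-components", "react-session-handler"]
);

console.log(JSON.stringify(narrowed.workspaces, null, 2));
```

A real version would presumably also need to work out which common packages the selected app depends on, rather than receiving them as an argument.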

@SokratisVidros

That makes sense. Thanks a lot @SebastianBogado
