Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discussion-1226: Mill coursier config #2954

Draft
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

jackcviers
Copy link

This commit adds support for using the coursier default config files as a configuration file for coursier resolution.

It addresses the following concerns from
#1226:

  • The need for a mill config file.

    Coursier already has a default configuration file. Currently, in
    2.1.8, this is the scala-cli config
    file
    ).

    This file is accessible via CoursierPaths.scalaConfigFile. It is
    overridable, in a heirarchy starting with the system environment
    variable SCALA_CLI_CONFIG, then via the jvm system property
    scala-cli.config. It falls back to an OS-specific location (mac
    os: ~/Library/Application Support/ScalaCli/secrets/config.json, n linux: ~/.config/scala-cli/secrets/config.json, windows:
    %LOCALAPPDATA%\ScalaCli\data\secrets\config.json)

    The file allows for additional, unknown keys, which we could
    prefix with mill.

      Each key is allowed a prefix (it's root level json key), a
      type (see
      [Keys.scala](https://github.com/VirtusLab/scala-cli/blob/main/modules/config/src/main/scala/scala/cli/config/Keys.scala)
      and
      [Key.scala](https://github.com/VirtusLab/scala-cli/blob/main/modules/config/src/main/scala/scala/cli/config/Key.scala)),
      and a parse and write function to provide for handling any format
      of the key values.
    
  • The need to handle setting up proxies for dependency resolution via a config file.

    See scala-cli
    config
    .

    In the config.json, httpProxy.address contains the proxy
    address, httpProxy.user contains the proxy user, which can be fetched via env:SOME_ENV_VAR, value:somerawvalue, or even through a delegated command, like to vault or gpg.

  • The need to handle setting up repository mirrors.

    See See scala-cli
    config
    .

    In the config.json,
    repositories.mirrors sets up mirrors. They take the form of an array of tree:dest=source.

  • The need to handle credentials for private repositories.

    See See scala-cli
    config
    .

    In the config.json,
    repositories.defaults contains the list of default repositories to use. There are many formats for shared repositories, local repositories, and private repositories (ivy repos begin with ivy:, and contain their patterns, default type is a maven repository).

    The formats are supposed to handle user information for
    authentication within the url's user and password (user:password@url) for authentication, but in practice this is not the case with coursier. This commit uses the credentials as applied to the default coursier FileCache to map onto the appropriate repositories that match by host in CoursierModule.repositoriesTask. I have confirmed that this works against my own private repository.

Additional mill config needs can be met by adding a Key, using ConfigDb to fetch the key as in
PlatformResolve.confFileRepositories, and by adding the json encoded value to the default config file. The example parsers that handle secrets via environment, raw value, or commands are good examples of how to handle anything that may need to be secret within a company. Within the necessary utilities or mill task implementations, we can read the default scala config for the value that we need.

I confirmed that the current values work via unit test in CoursierMirrorTests.

This change should be backwards-binary-compatible and backwards-source-compatible. I believe old mill users will not be affected by this enhancement.

@jackcviers jackcviers force-pushed the discussion-1226-mill-config-files-use-coursier-configs branch from 0368d79 to a45d675 Compare January 8, 2024 18:30
@jackcviers jackcviers marked this pull request as ready for review January 8, 2024 20:55
Copy link
Member

@lefou lefou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's good to better support coursier setup via it's own configuration files.

When it comes to Mill's own configuration and user preferences, I don't think we should piggy-back on coursier configuration, even if it is meant to be extensible.

Comment on lines +48 to +54
// we can't use sys.props to isolate these. It will use the first
// value of any prop set in the props in any test. This will
// persist beyond the lifetime of the test to the other tests in
// the task run. I believe we would need to fork the jvm for each
// test in order to achieve isolation, but this would be
// costly. If anyone has ideas of how to better test these in
// isolation, please provide a PR.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is better tested in an integration test. We already have integration tests setup under integration.

Comment on lines 71 to 89
Resolve.proxySetup()
val resolvedCredentials = coursier.cache.CacheDefaults.credentials.flatMap { _.get() }
val repos = Await.result(
Resolve().finalRepositories.future(),
Resolve().finalRepositories.map {
_.map {
case x: IvyRepository =>
x.withAuthentication(
resolvedCredentials.find { c =>
c.matches(x.pattern.string, c.usernameOpt.getOrElse(""))
}.map(_.authentication)
)
case x: MavenRepository =>
x.withAuthentication(
resolvedCredentials.find { c =>
c.matches(x.root, c.usernameOpt.getOrElse(""))
}.map(_.authentication)
)
}
}.future(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How costly it this setup? Since it's implemented in an annonymous task, it's not cached and evaluated in every module again and again. We should consider an input target in a shared worker module to only read this configuration once.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe, we should also make sure to not override already present authentication?

Copy link
Author

@jackcviers jackcviers Feb 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How costly it this setup? Since it's implemented in an annonymous task, it's not cached and evaluated in every module again and again. We should consider an input target in a shared worker module to only read this configuration once.

Addressed in 87b90ee

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe, we should also make sure to not override already present authentication?

Addressed in 504233e

@jackcviers
Copy link
Author

jackcviers commented Jan 27, 2024

That's OK, we can define our own config file for other options.

I don't like that a default set for the cousier cli filters through to users of the coursier library as a default, but now it exists and would be a breaking change to prevent its use.

All this PR does is wire up the credentials from the coursier config. If we support using the config, then we need to support using the creds in it for resolution. It's probably confusing to users if something resolves correctly when using coursier from the command-line, scala-cli from the command line, and then fails to resolve in mill or sbt or other projects using coursier as a library for resolution because their default internal repositories don't have the credentials wired in from the config for like cousier cli does.

The mechanism for accomplishing that doesn't have to be like in this pr. I think just using the default file cache in coursier support would do it, too.

But I do think the way creds are applied in coursier modules should align with coursier cli if possible. And mirrors. And proxies, etc.

@lefou lefou marked this pull request as draft January 29, 2024 11:47
jackcviers added a commit to jackcviers/mill that referenced this pull request Feb 5, 2024
jackcviers added a commit to jackcviers/mill that referenced this pull request Feb 5, 2024
@jackcviers
Copy link
Author

jackcviers commented Feb 5, 2024

@lefou I

  • Moved the fetching of default repositories, credentials, mirrors, and proxy configuration; and the running of proxy setup to CoursierWorkerModule
  • Added guards to check for existing authentication on the repositories before applying the credentials from coursier
  • Added links to the existing coursier implementations in doc comments
  • Delegated the repositoriesTask to the CoursierWorkerModule
  • Brought the branch UTD with master

This way, no new members are added to CoursierModule to interfere with currently available integrations. If we'd prefer to move them back, let me know.

The integration tests are yet TODO.

I'm a little lost with the compile errors. I guess I need to move the impls back into CoursierModule for now. However, when I do that I am missing writers for the credentials in the workers:

// in CoursierModule
  /**
   * Gets the coursier credentials from coursier's configuration sources.
   *
   * @see [[https://github.com/coursier/coursier/blob/f6f82f2e7b55deb461dd041f8f11df428bcf3789/modules/cache/jvm/src/main/scala/coursier/cache/CacheDefaults.scala#L97] CacheDefaults.credentials]
   * @see [[https://get-coursier.io/docs/other-credentials] Coursier Credentials]
   * @see [[https://scala-cli.virtuslab.org/docs/guides/power/repositories/#repository-authentication] Scala-cli Repository Authentication]
   * @return the list of coursier credentials
   */
  def coursierCredentials(implicit ctx: Ctx): Worker[Seq[Credentials]] = T.worker {
    T.input {
      T.task {
        CacheDefaults.credentials.flatMap { _.get() }
      }.evaluate(ctx)
    }.t
  }
  // produces ---
  // [#07] [error] /Users/jackcviers/Documents/development/mill/scalalib/src/mill/scalalib/CoursierModule.scala: 
  // 60:13: could not find implicit value for parameter w: upickle.default.Writer[Seq[coursier.credentials.DirectCredentials]]
  // [#07] [error]     T.input {

see latest commit. Help is appreciated.

This commit adds support for using the coursier default config files
as a configuration file for coursier resolution.

It addresses the following concerns from
com-lihaoyi#1226:

- The need for a mill config file.

    Coursier already has a default configuration file. Currently, in
    2.1.8, this is the [scala-cli config
    file](https://scala-cli.virtuslab.org/docs/commands/config/#:~:text=Configuration%20values%20are,config.json)).

    This file is accessible via `CoursierPaths.scalaConfigFile`. It is
    overridable, in a heirarchy starting with the system environment
    variable `SCALA_CLI_CONFIG`, then via the jvm system property
    `scala-cli.config`. It falls back to an OS-specific location (mac
		os: `~/Library/Application Support/ScalaCli/secrets/config.json`,
n		linux: `~/.config/scala-cli/secrets/config.json`, windows:
		`%LOCALAPPDATA%\ScalaCli\data\secrets\config.json`)

    The file allows for additional, unknown keys, which we could
		prefix with `mill`.

		Each key is allowed a prefix (it's root level json key), a
		type (see
		[Keys.scala](https://github.com/VirtusLab/scala-cli/blob/main/modules/config/src/main/scala/scala/cli/config/Keys.scala)
		and
		[Key.scala](https://github.com/VirtusLab/scala-cli/blob/main/modules/config/src/main/scala/scala/cli/config/Key.scala)),
		and a parse and write function to provide for handling any format
		of the key values.

- The need to handle setting up proxies for dependency resolution via
a config file.

    See [scala-cli
		config](https://scala-cli.virtuslab.org/docs/commands/config/).

    In the `config.json`, `httpProxy.address` contains the proxy
		address, `httpProxy.user` contains the proxy user, which can be
		fetched via `env:SOME_ENV_VAR`, `value:somerawvalue`, or even
		through a delegated command, like to vault or gpg.

- The need to handle setting up repository mirrors.

    See See [scala-cli
		config](https://scala-cli.virtuslab.org/docs/commands/config/).

    In the `config.json`,
		[`repositories.mirrors`](https://scala-cli.virtuslab.org/docs/reference/commands#config:~:text=repositories.mirrors%20Repository%20mirrors%2C%20syntax%3A%20repositories.mirrors%20maven%3A*%3Dhttps%3A//repository.company.com/maven)
		sets up mirrors. They take the form of an array of
		`tree:dest=source`.

- The need to handle credentials for private repositories.

    See See [scala-cli
		config](https://scala-cli.virtuslab.org/docs/commands/config/).

    In the `config.json`,
		[`repositories.defaults`](https://scala-cli.virtuslab.org/docs/reference/commands#config:~:text=repositories.default%20Default%20repository%2C%20syntax%3A%20https%3A//first%2Drepo.company.com%20https%3A//second%2Drepo.company.com)
		contains the list of default repositories to use. There are many
		formats for [shared
		repositories](https://github.com/coursier/coursier/blob/main/modules/coursier/shared/src/main/scala/coursier/internal/SharedRepositoryParser.scala),
		[local
		repositories](https://github.com/coursier/coursier/blob/1539a6a2ba4642c7cfe4699c3d8252a864d84965/modules/coursier/jvm/src/main/scala/coursier/internal/PlatformRepositoryParser.scala#L18-L23),
		and private repositories (ivy repos begin with `ivy:`, and contain
		their patterns, default type is a maven repository).

    The formats are supposed to handle user information for
		authentication within the url's user and
		password (user:password@url) for authentication, but in practice
		this is not the case with coursier. This commit uses the
		credentials as applied to the default coursier `FileCache` to map
		onto the appropriate repositories that match by host in
		`CoursierModule.repositoriesTask`. I have confirmed that this
		works against my own private repository.

Additional mill config needs can be met by adding a `Key`, using
`ConfigDb` to fetch the key as in
`PlatformResolve.confFileRepositories`, and by adding the json encoded
value to the default config file. The example parsers that handle
secrets via environment, raw value, or commands are good examples of
how to handle anything that may need to be secret within a
company. Within the necessary utilities or mill task implementations,
we can read the default scala config for the value that we need.

I confirmed that the current values work via unit test in
`CoursierMirrorTests`.

This change should be backwards-binary-compatible and
backwards-source-compatible. I believe old mill users will not be
affected by this enhancement.
@jackcviers jackcviers force-pushed the discussion-1226-mill-config-files-use-coursier-configs branch from 504233e to f8c6bf7 Compare February 5, 2024 04:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants