Filter all autogen modules / c-sources #3656

Closed
fmaste opened this Issue Aug 1, 2016 · 31 comments

@fmaste
Collaborator

fmaste commented Aug 1, 2016

Currently, when calling sdist, only the autogenerated module Paths_* is ignored. This is a problem for packages with build-type Custom that generate other modules like this one, because you get "Error: Could not find module: ..."

My quick fix was simply not to use sdist, because I don't need it. Not really an elegant solution, but it worked until I added my package as a source to a sandbox and discovered that building in the sandbox calls sdist on my package!

My proposed solution is to list all modules in the autogen directory and filter them from exposed-modules and other-modules, not just Paths_* as the function filterAutogenModule in Distribution.Simple.SrcDist does. This is what I'm actually doing with a hook in my setup script.
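A minimal sketch of the filtering described above, assuming module names are plain strings and the autogen directory has already been listed as relative paths. The names `pathToModule` and `dropAutogen` are hypothetical and are not part of Cabal; the real `filterAutogenModule` works on Cabal's own types.

```haskell
import Data.List (isSuffixOf)

-- Hypothetical helper: turn a path relative to the autogen directory,
-- such as "A/B/C.hs", into a dotted module name such as "A.B.C".
-- Files without a ".hs" extension are ignored.
pathToModule :: FilePath -> Maybe String
pathToModule path
  | ".hs" `isSuffixOf` path = Just (map slashToDot (take (length path - 3) path))
  | otherwise               = Nothing
  where
    slashToDot c = if c == '/' then '.' else c

-- Drop every module that was found in the autogen directory from a
-- list of declared modules (exposed-modules or other-modules).
dropAutogen :: [FilePath] -> [String] -> [String]
dropAutogen autogenFiles declared =
  filter (`notElem` generated) declared
  where
    generated = [m | Just m <- map pathToModule autogenFiles]
```

With an autogen listing of `["Paths_foo.hs", "Gen/Version.hs"]`, the declared modules `["Foo", "Gen.Version", "Paths_foo"]` reduce to `["Foo"]`. Note that this only works when the autogen directory has already been populated.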

@ezyang

Contributor

ezyang commented Aug 2, 2016

First, let's talk workarounds. I can think of a few:

  1. Give cabal new-build a try instead of sandboxes (http://blog.ezyang.com/2016/05/announcing-cabal-new-build-nix-style-local-builds/); I am not sure why sandbox calls sdist eagerly, but new-build should not call it until after we've finished building (in both cases, we call sdist to figure out which files to monitor to handle recompilation avoidance). This might not work but it's worth a try.
  2. You're using a Custom script to generate more files, right? You may be able to nub out the list of other-modules/exposed-modules before they get passed to sdist in the sdist hook; that would also "solve" the problem.

OK, how about properly fixing this? First off, your proposed solution doesn't completely work, because if sdist is called before you've actually run a build, there won't be anything in autogen, and so we'll incorrectly conclude that they are missing. Furthermore, if they do exist, we DO want to rebuild if the autogenerated file changes.

What's the right way to fix this? I don't know. Here are some possibilities:

  1. Don't worry too much about the sdist from clean case and implement your workaround.
  2. Add a new field to the Cabal description stating what modules are autogenerated (for BC, it wouldn't be an error to repeat them in other-modules/exposed-modules, I suppose). Then sdist would just skip over those.
  3. If workaround (2) didn't work, introduce a more flexible mechanism by which we can modify the PackageDescription; some sort of generalized HookedBuildInfo
  4. Add a new flag to sdist saying what module names it can ignore. Then a Custom script can just feed those in.
  5. Fix #3401, and then the new list-sources command WOULD properly find and include the autogenerated modules, and then recompilation avoidance would work correctly when an autogenerated module changes. (But you still can't sdist...)

I like (5) but it requires a (very useful) yak to be shaved first.

@fmaste

Collaborator

fmaste commented Aug 2, 2016

I'm doing workaround 2 ("nub out the list of other-modules/exposed-modules before they get passed to sdist in the sdist hook"). My autogen modules are built in the postConf hook, just to be sure that they are there through the whole process, which I don't completely understand, and at least there have been no problems so far.

About a proper fix: only 2 and 3 could work for all the cases I can think of; with 1 and 5 a clean sdist will still fail. With 4, if for example we want to add a cabal hlint command that works without running configure, it will still fail, or a new flag is needed with many more hooks.

  6. Right now Paths_PACKAGENAME is hardwired; we could treat all modules named *_PACKAGENAME the same way. Really an easy-but-not-so-elegant fix. From the user side the only problem is that if someone wants to build a complex hierarchy of autogen modules, it would have to be flat.

A combination of 2 and 3 could work:
An autogen-modules field could be added next to exposed-modules and other-modules. In my situation a readDesc hook would be in charge of adding this, Paths_PACKAGENAME would be added automagically to stay backwards compatible, and the function prepareTree in Distribution.Simple.SrcDist would be changed to return exposed-modules + other-modules - autogen-modules.
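The rule "exposed-modules + other-modules - autogen-modules" can be sketched as follows. This is a simplified model: the `Component` record and `sdistModules` are hypothetical stand-ins and do not reproduce Cabal's `BuildInfo` or `prepareTree`.

```haskell
import Data.List (nub, (\\))

-- Simplified stand-in for the fields of one component; this is not
-- Cabal's BuildInfo type.
data Component = Component
  { exposedModules :: [String]
  , otherModules   :: [String]
  , autogenModules :: [String]
  }

-- Modules that sdist should look for on disk:
-- exposed-modules + other-modules - autogen-modules.
-- Paths_<pkg> is always treated as autogenerated, for backwards
-- compatibility with packages that do not list it.
sdistModules :: String -> Component -> [String]
sdistModules pkgName c =
  nub (exposedModules c ++ otherModules c)
    \\ nub (pathsModule : autogenModules c)
  where
    pathsModule = "Paths_" ++ map dashToUnderscore pkgName
    dashToUnderscore ch = if ch == '-' then '_' else ch
```

Because the subtraction happens on module names, it does not matter whether the generated files exist yet, which is what makes a clean sdist work.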

@ezyang

Contributor

ezyang commented Aug 2, 2016

(6) is an idea. Rather than require _PACKAGENAME, we could treat the Autogen prefix specially and filter out all those modules. It is similar to (2) but you don't have to add another field. It is a case of convention over configuration; I am not sure how I feel about it (maybe a bit too implicit.)

I think autogen-modules is simple and explicit. But I wonder what other people think.
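For comparison, the two naming conventions floated in this thread could be expressed as predicates on module names. This is only a sketch: both function names are hypothetical, and reading "the Autogen prefix" as a reserved top-level `Autogen` namespace is an assumption.

```haskell
import Data.List (isPrefixOf, isSuffixOf)

-- Idea from earlier in the thread: any module named *_PACKAGENAME is
-- autogenerated, generalising the hardwired Paths_PACKAGENAME case.
hasPackageSuffix :: String -> String -> Bool
hasPackageSuffix pkgName modName = ("_" ++ pkgName) `isSuffixOf` modName

-- Idea (6): any module under a reserved top-level "Autogen" namespace
-- is autogenerated (assumed interpretation of "the Autogen prefix").
hasAutogenPrefix :: String -> Bool
hasAutogenPrefix modName =
  modName == "Autogen" || "Autogen." `isPrefixOf` modName
```

Either predicate could replace an explicit list, at the cost of the implicitness discussed above.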

@ekmett

Member

ekmett commented Aug 3, 2016

If you're looking at autogen-modules then things like autogen-c-sources and the ilk would also be nice to have.

@23Skidoo

Member

23Skidoo commented Aug 3, 2016

Related: #719, #1046.

@fmaste fmaste self-assigned this Aug 9, 2016

@fmaste fmaste closed this in #3670 Aug 11, 2016

fmaste added a commit that referenced this issue Aug 11, 2016

Merge pull request #3670 from fmaste/autogen-modules
Add new 'autogen-modules' field. This closes #3656 and  #719
@ezyang

Contributor

ezyang commented Aug 11, 2016

Let's keep this open, because @ekmett requested autogen-c-sources as well. @fmaste would you be interested in taking a whack at that?

@ezyang ezyang reopened this Aug 11, 2016

@ezyang ezyang changed the title Filter all autogen modules Filter all autogen modules / c-sources Aug 11, 2016

@fmaste

Collaborator

fmaste commented Aug 12, 2016

OK, I'll start by doing some cleanup on SrcDist before this.

Which fields are "autogen-c-sources and the ilk"? js-sources and what else?

@23Skidoo

Member

23Skidoo commented Aug 12, 2016

I believe it's only js-sources, though in future it'd be nice to support more languages (so we'd have cpp-sources, obj-c-sources, rust-sources, etc.).

@fmaste

Collaborator

fmaste commented Aug 12, 2016

So, maybe a global autogen stanza is needed where we can put c-sources, js-sources, etc if the idea is to have many more

@ezyang

Contributor

ezyang commented Aug 12, 2016

A new stanza wouldn't really fit with what stanzas are currently used for (which is to specify new components in Cabal.) I think that is not a yak that should be shaved right now.

@fmaste

Collaborator

fmaste commented Aug 12, 2016

So, maybe a global autogen stanza is needed where we can put c-sources, js-sources, etc if the idea is to have many more. And why not move autogen-modules there

@23Skidoo

Member

23Skidoo commented Aug 13, 2016

If you add a top-level autogen stanza, it'll be no longer possible to find out which component some particular autogen-files belong to.

@fmaste

Collaborator

fmaste commented Aug 13, 2016

Yes. The c-sources go as usual under a library, exe, test or bench stanza, and the ones that are autogenerated are repeated in autogen:

autogen
   c-sources: cbits/foobaz.c

library
   c-sources: cbits/foobar.c, cbits/foobaz.c

c-sources and js-sources are global; they don't have include directories.

@23Skidoo

Member

23Skidoo commented Aug 13, 2016

It makes less sense to repeat the file name both in autogen-c-sources and c-sources, unlike the exposed-modules/other-modules case, where it's not obvious which one the generated module should belong to.

@fmaste

Collaborator

fmaste commented Aug 13, 2016

Yes, but it will be weird to have autogen-modules repeated and autogen-c-sources not; mostly, it is difficult to understand. Having everything in an autogen stanza is really verbose but easier to read; the definition of the stanza is just "the sources that don't come with the package".
I like the idea of being explicit here (it is easier to check the intended behaviour, even with hooks), but having to repeat conditionals in a package description scares me.
If we are going to include more *-sources in the future, it will be that field plus autogen-*-sources for every addition, and I prefer the same field twice, once in the component and once in autogen.

@ezyang

Contributor

ezyang commented Aug 13, 2016

I definitely agree it's not ideal to have to create a field for every type of autogenerated source file. Maybe we can come up with some alternative ideas.

One possibility is to have autogen-files as a catch all for all autogenerated files. We didn't want to do this for modules because modules go through a search path so it's not obvious where the module lives, but for other types of files there's no such search path, so we can just stick them all together, since they don't have include directories.

@fmaste

Collaborator

fmaste commented Aug 13, 2016

We can assume (assumption is the mother of all fu**ups) that a module can't be repeated inside a package for different search paths, if that's not already part of what we expect, or start to enforce it, and include modules in this new catch-all stanza.
It should be bad practice to have the same module twice in a package, and we add a unique place to say "don't search for these modules".

@ezyang

Contributor

ezyang commented Aug 13, 2016

> We can assume (assumption is the mother of all fu**ups) that a module can't be repeated inside a package for different search paths, if that's not already part of what we expect, or start to enforce it

To be precise, a module can appear multiple times in the search path, but we only take the first occurrence of the module.

> include modules in this new catch-all stanza.

I don't think this will work for modules.

Consider what listPackageSources does, it outputs a list of source file paths which are in the package, based on fields like cSources and jsSources. What the proposed autogen-files field would do is FILTER out any source file which matches a file path that is being output. So if I write c-sources: foo.c and then autogen-files: foo.c, ordinarily foo.c would be treated as a regular source file, but since foo.c is in autogen-files we exclude it. This works uniformly across c-sources, js-sources, and everything that does NOT use the search path.

Now let's consider the situation with Haskell source files, which we search for in the include path. Let's say we have hs-source-dirs: src, alt, and we have an exposed-module: Paths_foo. What file path would we put in autogen-files to exclude Paths_foo? Since this is an autogenerated module, it lives in NEITHER src/Paths_foo.hs nor alt/Paths_foo.hs. So there isn't any file name to filter out; if you look at allSourcesBuildInfo, you can see that the failure happens because we can't find the module name at all.

So it still seems useful to have a separate autogen-modules from autogen-files. The rule is that for any field we add to a Cabal file for which the name of the file is not known a priori (because it is computed from a module name, or searched for), you will need to make a new field for it. But for everything else (anything you just append on to the output of allSourcesBuildInfo, those can all go in autogen-files). What do people think?
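The distinction can be sketched with a toy model in which the source tree is just a list of existing paths. All names here (`Tree`, `findModule`, `filterAutogenFiles`, `sourceModules`) are hypothetical, and only the `.hs` extension is tried, whereas the real lookup also tries other known extensions.

```haskell
-- Files known to exist in the source tree (a stand-in for the file system).
type Tree = [FilePath]

-- Look a module up along hs-source-dirs; the first directory that
-- contains it wins, and Nothing means the module has no source file.
findModule :: Tree -> [FilePath] -> String -> Maybe FilePath
findModule tree searchDirs modName =
  case [p | d <- searchDirs, let p = d ++ "/" ++ relPath, p `elem` tree] of
    (p : _) -> Just p
    []      -> Nothing
  where
    relPath = map dotToSlash modName ++ ".hs"
    dotToSlash c = if c == '.' then '/' else c

-- Proposed autogen-files: drop listed paths from the explicit source
-- lists (c-sources, js-sources, ...), comparing paths as written.
filterAutogenFiles :: [FilePath] -> [FilePath] -> [FilePath]
filterAutogenFiles autogenFiles = filter (`notElem` autogenFiles)

-- autogen-modules: drop names before the lookup, so that modules with
-- no file under hs-source-dirs never reach findModule.
sourceModules :: [String] -> [String] -> [String]
sourceModules autogenMods = filter (`notElem` autogenMods)
```

In a tree containing `src/A.hs` and `alt/A.hs`, looking up `A` along `["src", "alt"]` gives `Just "src/A.hs"`, while `Paths_foo` gives `Nothing`: there is no path to put in autogen-files, so the module has to be excluded by name.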

@fmaste

Collaborator

fmaste commented Aug 13, 2016

I'm talking about filtering modules by ModuleName not by FilePath, and only filter the other sources (*-sources) by FilePath

autogen
    modules:
        A.B.C
        Paths_package
    c-sources:
        cbits/foobaz.c

library
    exposed-modules:
        A.B.C
    other-modules:
        D.E.F
        Paths_package
    c-sources:
        cbits/foobar.c
        cbits/foobaz.c

This is possible if a module name can't be repeated, that is, if we never have two files declaring module MyModule () where, at src/lib/MyModule.hs and src/exe/MyModule.hs, in the same package at the same time.

On allSourcesBuildInfo we fail if we don't find the file that corresponds to a module name plus one of the known extensions, that's why we filter Paths_* and all the autogen-modules by ModuleName before converting them to a FilePath, without even knowing what search paths we have.

@ezyang

Contributor

ezyang commented Aug 13, 2016

Ah, I think I understand you now, we are still talking about an autogen stanza. I think it is empirically the case that people define multiple components which include the same module name at different paths. At the very least, in a package with multiple executables you would expect Main.hs to show up multiple times. Even if this was not true, I don't see why it shouldn't be allowed: the modules of each component live in completely different namespaces and are given different symbol names.

Although it was not historically the case, these days I think it is best to think of a Cabal package as a simple wrapper around multiple components, which actually contain all the interesting bits. If something is declared at the top-level, outside of a stanza, it should be interpreted as meaning, "copy-paste me into each component stanza".

(Let me reregister my objection to an autogen stanza, as it is NOT a component like custom-setup, library, executable, etc are components! All other things equal, we ought to keep this symmetry, and autogenerated files is not a good enough reason. Don't let the tail wag the dog.)

> On allSourcesBuildInfo we fail if we don't find the file that corresponds to a module name plus one of the known extensions, that's why we filter Paths_* and all the autogen-modules by ModuleName before converting them to a FilePath, without even knowing what search paths we have.

Yes, I misunderstood your proposal.

@fmaste

Collaborator

fmaste commented Aug 13, 2016

> At the very least, in a package with multiple executables you would expect Main.hs to show up multiple times.

main-is contains a FilePath; we even strip its extension and search for it with the usual ones, and if it is not found we try the given one. We are treating it as a main file, not a main module. I usually have a single main folder where I put a bunch of modules, all named module Main where but with different FilePaths.
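The search order just described can be sketched as a list of candidate paths. This follows the description in this comment rather than Cabal's source; `splitExt` and `mainIsCandidates` are hypothetical names, and the list of "usual" extensions is passed in as an assumption.

```haskell
import Data.List (nub)

-- Split "exe/Main.hs" into ("exe/Main", ".hs"); a path without an
-- extension yields an empty second component.
splitExt :: FilePath -> (FilePath, String)
splitExt path =
  case break (== '.') (reverse fileName) of
    (revExt, '.' : revBase) | not (null revBase) ->
      (dir ++ reverse revBase, '.' : reverse revExt)
    _ -> (path, "")
  where
    (revFile, revDir) = break (== '/') (reverse path)
    fileName = reverse revFile
    dir = reverse revDir

-- Candidate paths in the order described: the base name with each of
-- the usual extensions first, then the path exactly as written.
mainIsCandidates :: [String] -> FilePath -> [FilePath]
mainIsCandidates usualExts path =
  nub ([base ++ ext | ext <- usualExts] ++ [path])
  where
    (base, _) = splitExt path
```

For `main-is: app/Main.hsc` and usual extensions `[".hs", ".lhs"]`, the candidates are `app/Main.hs`, `app/Main.lhs` and finally `app/Main.hsc`. The point is that the lookup is keyed on a file path, never on the module name `Main`.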

> Even if this was not true, I don't see why it shouldn't be allowed: the modules of each component live in completely different namespaces and are given different symbol names.

What happens now if an executable depends on the package library and this same executable has a module name on its search path that clashes with one in the library? Strange things happen at the edges. If we try to avoid duplicate module names across the Hackage universe, it could be welcome to enforce this within a single package, or at least to warn people about it with autogen. I don't see this as a showstopper.

> (Let me reregister my objection to an autogen stanza, as it is NOT a component like custom-setup, library, executable, etc are components! All other things equal, we ought to keep this symmetry, and autogenerated files is not a good enough reason. Don't let the tail wag the dog.)

It is a component if we treat it like one, by having cabal autogen, an Autogen.hs much like Setup.hs and a special hook like the one proposed with #3600 and having all of them on dist/autogen/ . If we are planning to have a strong support for autogen modules/files I don't see why not.

I don't say that this is the way to go but I don't like the other options, they seem hackish and difficult to treat as first class citizens of Cabal instead of complex hooks. This below looks wrong, I prefer my own hooks:

library
    exposed-modules:
        A.B.C
    other-modules:
        D.E.F
        Paths_package
    autogen-modules:
        A.B.C
        Paths_package
    c-sources:
        cbits/foobar.c
    autogen-c-sources:
        cbits/foobaz.c
    js-sources:
        jsbits/foobar.js
    autogen-js-sources:
        jsbits/foobaz.js

Now imagine this after adding cpp-sources, obj-c-sources, rust-sources and autogen-cpp-sources, autogen-obj-c-sources, autogen-rust-sources.

@ezyang

Contributor

ezyang commented Aug 13, 2016

> main-is contains a FilePath, we even take out its extension and search for it with the usual ones and if not found we try with the given one.

You are right, I retract that as an example.

> What happens now if an executable depends on the package library and this same executable has a module name on its search path that clashes with one on the library?

So, we have to be careful here, because there are a few possible situations. We also have to be careful because the behavior changed recently (I think 1.24 has started doing the right behavior, but if you really care I can doublecheck)

  • The executable depends on the package library, the executable's hs-source-dirs is exe, the library's hs-source-dirs is lib, and there are lib/A.hs and exe/A.hs. Result: it just works. lib/A.hs gets compiled with a unit id lib-0.1-abcd, and exe gets compiled with the unqualified package key; the modules don't have conflicting original names and can be linked together.
  • The executable doesn't depend on the package library, the executable's and library's hs-source-dirs are the same, and there is an A.hs which is exposed by lib and transitively depended upon by the executable's Main.hs. In this case, the executable and library will literally rebuild A.hs under the different project settings; there is no sharing, they just happen to use the same source file (so you could have made a copy of A.hs and separated the source files). There will be TWO A.hi/A.o in the build directories; one for lib, one for exe.

> I don't say that this is the way to go but I don't like the other options, they seem hackish and difficult to treat as first class citizens of Cabal instead of complex hooks. This below looks wrong, I prefer my own hooks:

How about this as an alternative:

library
    exposed-modules:
        A.B.C
    other-modules:
        D.E.F
        Paths_package
    autogen-modules:
        A.B.C
        Paths_package
    c-sources:
        cbits/foobar.c
    js-sources:
        jsbits/foobar.js
    autogen-files:
        cbits/foobar.c jsbits/foobaz.js

I mean, are you really going to have that many autogenerated files?

> It is a component if we treat it like one, by having cabal autogen, an Autogen.hs much like Setup.hs and a special hook like the one proposed with #3600 and having all of them on dist/autogen/ . If we are planning to have a strong support for autogen modules/files I don't see why not.

That is an interesting suggestion, but it has some trouble. Suppose an autogenerated module wants to import a non-autogenerated module. Then you cannot just put it in the autogenerated stanza; you also have to move all of its dependencies into their own component (convenience library I guess) so that the autogenerated component can depend on them. I don't know how often autogenerated files actually depend on non-autogenerated ones in this way, but it is something that is allowed in the current model.

@fmaste

Collaborator

fmaste commented Aug 14, 2016

> The executable depends on the package library, executable's hs-source-dirs is exe, library's hs-source-dirs is lib, there is lib/A.hs and exe/A.hs. Result: it just works. lib/A.hs gets compiled with a unit id lib-0.1-abcd, and exe gets compiled with the unqualified package key, the modules don't have conflicting original names and can be linked together.

When I do import A on the main-is of the executable which one is imported? The one from the library or from the executable? If there is no way to qualify an import by component we don't support same module names between components.
Repeating module names in the same package is a misuse: you can override the exposed-modules of the library (internal or external) and break things unexpectedly in big projects. Imagine defining a Prelude module instead of hiding it and loading Distribution.Compat.Prelude.
Not for this issue, but for the benefit of Cabal we should start assuming that every module is unique inside a package. Or at least treat it as "use at your own risk" feature, nothing that can't be solved with CPP or a simple if.

> That is an interesting suggestion, but it has some trouble. Suppose an autogenerated module wants to import a non-autogenerated module. Then you cannot just put it in the autogenerated stanza; you also have to move all of its dependencies into their own component (convenience library I guess) so that the autogenerated component can depend on them. I don't know how often autogenerated files actually depend on non-autogenerated ones in this way, but it is something that is allowed in the current model.

If it works now, it will work if you always put the autogenerated modules beside the other-modules they import; the dist/autogen directory will be part of GHC's include dirs along with the ones of the stanza being built. If I have three executables that use module Utils, and Utils imports other modules, all of them have to be repeated in other-modules of the three executables. Having a module in other-modules without the modules it imports is an error, autogenerated or not.

library
    exposed-modules:
        A.B.C
    other-modules:
        D.E.F
        Paths_package
    autogen-modules:
        A.B.C
        Paths_package
    c-sources:
        cbits/foobar.c
    js-sources:
        jsbits/foobar.js
    autogen-files:
        cbits/foobar.c jsbits/foobaz.js

I also like this alternative because it is faster to implement, though not to explain; it still feels like mixing UI and business logic. This seems to be a discussion of convenience and style, but I prefer the new stanza because:

  • Our most used case is Paths_* and is a global package module, easier to put just once on the new stanza.
  • No modules/files are repeated on the same stanza.
  • I prefer having a place to enumerate all the modules/sources excluded on sdist like extra-source-files includes.
  • I would prefer to add a new field to the PackageDescription data type, which is ubiquitous as a function parameter: an autogen index, instead of having to build one from every other stanza.
  • Easier to implement autogen modules with custom hooks because they have to modify only one place.
@ekmett

Member

ekmett commented Aug 14, 2016

I rather like the autogen-files proposal, as it neatly covers "all things that aren't modules" -- if it works.

It might have problems how it figures out things when you have different paths for things for includes, c source directories, when you're dumping stuff in the dist/build/autogen dir by hand, etc. given that the autogen dir isn't in a known fixed path relative to the start of your project.

@ezyang

Contributor

ezyang commented Aug 14, 2016

> When I do import A on the main-is of the executable which one is imported?

The executable module is imported in all cases. The reason for this is that GHC has the following precedence rule for module imports: if a module is defined in the local package, it always has precedence when doing an import.

> Imagine defining a Prelude module instead of hiding it and loading Distribution.Compat.Prelude.

Believe it or not, this is explicitly supported by GHC: if you define a module in the current package named Prelude, GHC will automatically preferentially use it over the actual Prelude.

> Not for this issue, but for the benefit of Cabal we should start assuming that every module is unique inside a package.

While I am sympathetic to this perspective (global namespaces are convenient), we should not do this. Here's another reason why not to: Backpack's mechanism of mix-in linking encourages implementation convenience libraries to have the same module name (e.g. bytestring-impls and string-impls would both expose a module named Str which can be mix-in linked against a Str.hsig.) It would break a lot of code people might like to write to do it this way.

> It will work if you always put the autogenerated modules besides the other-modules it imports

I mean, yes, you can give the autogen stanza those semantics, and it will indeed work just the same as it does today; all I'm saying is that it is inconsistent with all other stanzas.

I think part of the confusion stems from a misunderstanding of how internal build-depends works (which I hopefully cleared up in the previous few answers.) When you build-depends on an internal library, the semantics is NOT that the contents of the library stanza are splatted into your current package. Rather, it is as if there were an actual libraries package containing only the library which you depend on.

> Our most used case is Paths_* and is a global package module, easier to put just once on the new stanza.

No, no no, Paths_* is emphatically NOT a global package module, or, to be more precise, it's possible it used to be a global package module but I fixed it in HEAD so that it is per-component.

> I would prefer to add a new field to the PackageDescription data type, which is ubiquitous as a function parameter: an autogen index, instead of having to build one from every other stanza.

Well, since autogen-files goes in BuildInfo there wouldn't be any duplication code side; BuildInfo is precisely the common data format for every type of component, but specified per component.

@ezyang

Contributor

ezyang commented Aug 14, 2016

> It might have problems how it figures out things when you have different paths for things for includes, c source directories, when you're dumping stuff in the dist/build/autogen dir by hand, etc. given that the autogen dir isn't in a known fixed path relative to the start of your project.

I think the idea is that you specify the unqualified path (you should never mention dist in a Cabal file) that occurred in your Cabal file, and Cabal should somehow take care of the rest.

@fmaste

Collaborator

fmaste commented Aug 14, 2016

> I rather like the autogen-files proposal, as it neatly covers "all things that aren't modules" -- if it works.

It should work, it's just comparing strings. That's why I don't understand your other comment:

> It might have problems how it figures out things when you have different paths for things for includes, c source directories, when you're dumping stuff in the dist/build/autogen dir by hand, etc. given that the autogen dir isn't in a known fixed path relative to the start of your project.

*-sources don't use include directories, they are just paths relative to the project folder and the autogen directory comes from BuildPath.autogenPackageModulesDir or BuildPath.autogenComponentModulesDir for the generator to decide which one to use.

@23Skidoo

Member

23Skidoo commented Aug 14, 2016

> I rather like the autogen-files proposal, as it neatly covers "all things that aren't modules" -- if it works.

I can live with autogen-files as well, though I prefer autogen-$LANG-sources.

@fmaste

Collaborator

fmaste commented Aug 15, 2016

:) awesome discussion thread!

> if you define a module in the current package named Prelude, GHC will automatically preferentially use it over the actual Prelude.

Haven't tried. I consider modules with the same name in the same package really bad practice, because you are creating a new module with a different signature and making it impossible to use the older one. They are not the same, and that means violating the only unique ID we have to reference our own package's modules in source code.
I'm starting to have nightmares of my spaghetti-PHP days, where, editing Haskell code, I have to infer the search path to know which of the modules with the same name to change.

> While I am sympathetic to this perspective (global namespaces are convenient), we should not do this. Here's another reason why not to: Backpack's mechanism of mix-in linking encourages implementation convenience libraries to have the same module name (e.g. bytestring-impls and string-impls would both expose a module named Str which can be mix-in linked against a Str.hsig.) It would break a lot of code people might like to write to do it this way.

I'm obviously much less than a Backpack expert, so I'll just ask and learn: is there a need for autogenerated modules with the same name but different source in the same package? Isn't it the same as above, only if you want to be nasty?

> No, no no, Paths_* is emphatically NOT a global package module, or, to be more precise, it's possible it used to be a global package module but I fixed it in HEAD so that it is per-component.

I'll check it with HEAD; at least with 1.24 it's always in dist/build/autogen/. But be it in BuildPath.autogenPackageModulesDir or BuildPath.autogenComponentModulesDir, it's the same module name with the exact same source code. It's the generator's responsibility to put the autogenerated modules in the correct place, unless we create hooks that are called when the file is needed.

> I mean, yes, you can give the autogen stanza those semantics, and it will indeed work just the same as it does today; all I'm saying is that it is inconsistent with all other stanzas.
> I think part of the confusion stems from a misunderstanding of how internal build-depends works (which I hopefully cleared up in the previous few answers.) When you build-depends on an internal library, the semantics is NOT that the contents of the library stanza are splatted into your current package. Rather, it is as if there were an actual libraries package containing only the library which you depend on.

I think we are talking about different things. The autogen stanza won't be like library and executable; it's like extra-source-files and data-files. A place to put the names of all the modules/sources that are autogenerated / not included in sdist / not preprocessed before configure, etc., but needed to build components. Just that instead of being spread and repeated across every stanza, we make a special place for them.

I still don't see a red flag on either of the two options and think it's just a matter of taste. Right now I'm losing 2 to 1, and if nobody else speaks up I'll implement the autogen-*-sources field, to my dislike (just the option; I don't dislike making the changes).

@ezyang

Contributor

ezyang commented Aug 15, 2016

> They are not the same and that means violating the only unique ID we have to reference own package modules on source code.

Your point is valid, though I will note that this is not completely true, as you can use package qualified imports to disambiguate (in some cases).

> I'm starting to have nightmares of my spaghetti-php-days where editing haskell code I have to infer the search path to know which of the modules with the same name to change.

The rule is, first look for a module named this way in the current project. If none is present, look at the exports of all the packages you build-depends on.
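That precedence rule can be modelled in a few lines. This is a simplified sketch, not GHC's implementation: `Origin` and `resolveImport` are hypothetical names, packages are represented only by the module names they expose, and package-qualified imports are ignored.

```haskell
data Origin = LocalComponent | FromPackage String
  deriving (Eq, Show)

-- Resolve an import: modules of the component being built win; only
-- if none matches are the exposed modules of build-depends consulted.
-- More than one providing dependency is reported as ambiguous.
resolveImport :: [String] -> [(String, [String])] -> String -> Either String Origin
resolveImport localMods deps modName
  | modName `elem` localMods = Right LocalComponent
  | otherwise =
      case [pkg | (pkg, exposed) <- deps, modName `elem` exposed] of
        [pkg] -> Right (FromPackage pkg)
        []    -> Left ("Could not find module " ++ modName)
        pkgs  -> Left ("Ambiguous module " ++ modName ++ ", found in: " ++ unwords pkgs)
```

With local modules `["A", "Main"]` and a dependency `("lib", ["A", "B"])`, importing `A` resolves to the local component and importing `B` resolves to `lib`, so there is never a need to guess from the search path which `A` is meant.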

> I'm obviously much less than an Backpack expert so I'll just ask and learn, is there a need for autogenerated modules with the same name but different source on the same package? Isn't it the same as above, only if you want to be nasty?

In case you haven't looked at Backpack too closely yet, the Motivation/Overview section here https://github.com/ezyang/ghc-proposals/blob/backpack/proposals/0000-backpack.rst should set the scene.

The design pattern with Backpack I'm thinking of is this: imagine that you are making a package which is parametrized over a large number of modules (imagine that it's parametrized over string representation, file path representation, monad, etc.) When using this parametrized package, a user could explicitly specify how each module should be implemented, but it would be far more convenient if the signatures and implementations have the same name; in this case, the user can just bring both the parametrized package and the implementations for the parameters into scope, and they magically "get put together".

If there are multiple implementations, evidently it's necessary for multiple packages to export the same module names. And if someone wants to use Backpack internally, then they want multiple libraries in the same package to export the same module names.

I know this smacks of record wildcards, and many people don't like this sort of implicitness. But what else are you going to do when there are 20+ module parameters to a package? This is a use-case we do want to support.

> I'll check it with HEAD, at least with 1.24 its always on dist/build/autogen/.

Here's the relevant snippet from Distribution.Simple.Build:

writeAutogenFiles :: Verbosity
                  -> PackageDescription
                  -> LocalBuildInfo
                  -> ComponentLocalBuildInfo
                  -> IO ()
writeAutogenFiles verbosity pkg lbi clbi = do
  createDirectoryIfMissingVerbose verbosity True (autogenComponentModulesDir lbi clbi)

  let pathsModulePath = autogenComponentModulesDir lbi clbi
                 </> ModuleName.toFilePath (autogenPathsModuleName pkg) <.> "hs"

autogenComponentModulesDir means it gets put in a per component directory.

> Just that instead of being spread and repeated on every stanza we make a special place for them.

Right. And so I guess I'm making two points, (1) extra-source-files/data-files are fields, so it makes sense that they behave this way, whereas your proposal is a stanza; there are no stanzas which behave this way, and (2) in most situations, the autogenerated files for each component are going to be different (if you're autogenerating some C code for the library, you're unlikely to autogen it AGAIN for the test suites, unless you're doing the "rebuild the library as part of the test suite hack", which is a hack and we shouldn't encourage). So it makes sense to specify it per component.

> if nobody else speaks up I'll implement the autogen-*-sources field to my dislike

I'm OK with this!

@ezyang ezyang added this to the Cabal 2.0 milestone Aug 16, 2016

@ezyang ezyang modified the milestones: Cabal 2.0, Soon, Later Sep 6, 2016

@ezyang

Contributor

ezyang commented Sep 6, 2016

This issue is subsumed by #3702
