Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Late includes #1650

Closed
wants to merge 1 commit into from
Closed

Late includes #1650

wants to merge 1 commit into from

Conversation

jdemeyer
Copy link
Contributor

@jdemeyer jdemeyer commented Mar 29, 2017

This pull request adds support for a late #include. My proposed syntax is to add a leading pipe to the filename:

cdef extern from "|spam.h":
    ...

For this cdef extern block, Cython will generate the #include statement after its own variable declarations. So it can be used if the C code in spam.h needs to refer to Cython variables.

cysignals has a use-case which is currently "solved" by some ugly hack involving .pxi files (for which I need #483). If this pull request is accepted, cysignals would no longer need to rely on .pxi files: sagemath/cysignals#49

Of course, there are plenty of bikeshedding opportunities for the syntax, but I care about the feature, not the syntax. I chose the leading pipe because it is unlikely to occur in actual filenames. (Also: think of the pipe as "piping" the Cython variables into the header file).

jdemeyer added a commit to sagemath/cysignals that referenced this pull request Mar 29, 2017
@robertwb
Copy link
Contributor

I'm in favor of the concept, not so sure about the syntax. One change (which would be more invasive) would be to order the includes and declarations in the same order that they occur in the file, e.g.

cdef int foo()
cdef extern from "whatever.h":
    [something that uses foo]
[more declarations]

This also has pros and cons, but might be more flexible.

jdemeyer added a commit to sagemath/cysignals that referenced this pull request Mar 30, 2017
@jdemeyer
Copy link
Contributor Author

I'm in favor of the concept, not so sure about the syntax.

Like I said, I care about the feature, not the syntax. Maybe a new keyword like

cdef extern from "spam.h" late:
    ...

but that will require changes to the parser.

One change (which would be more invasive) would be to order the includes and declarations in the same order that they occur in the file. This also has pros and cons, but might be more flexible.

Two obvious "cons" are that it will be not so easy to implement (certainly beyond what I could do) and that it has potential to break existing code.

jdemeyer added a commit to sagemath/cysignals that referenced this pull request Apr 5, 2017
jdemeyer added a commit to sagemath/cysignals that referenced this pull request Apr 5, 2017
@scoder
Copy link
Contributor

scoder commented Apr 9, 2017

Is there a clear reason why the includes must come before the declarations by default? Meaning, does this solve a new use case in addition to an existing one, or is there really just one use case that was solved incorrectly before?

@jdemeyer
Copy link
Contributor Author

jdemeyer commented Apr 9, 2017

Is there a clear reason why the includes must come before the declarations by default?

Obviously yes: if the header file defines a new type that you want to use in Cython. This is a very common use case.

@robertwb
Copy link
Contributor

On the other hand, needing to include code after Cython's declarations is not a common usecase. I'd like to see at least one more example where it'd be useful.

(I'd also like more intuitive syntax, but that's a smaller issue.)

@jdemeyer
Copy link
Contributor Author

I'd like to see at least one more example where it'd be useful.

So you want me to invent a new use before you want to merge this PR? I guess that some cases where currently a separate C file is added to the sources (i.e. something like Extension(sources=["foo.pyx", "foo_helper.c"]) could be replaced by late includes, but it's hard to substantiate that (and then one could argue whether late includes are a better solution).

In general, "late includes" are needed whenever some C code needs access to variables defined by Cython. I think that's a reasonable thing to do, even if it's uncommon.

@jdemeyer
Copy link
Contributor Author

There is another use case at https://trac.sagemath.org/ticket/21459#comment:22

@embray
Copy link
Contributor

embray commented Apr 28, 2017

I like this idea and can definitely see where it would be useful.

As to the bikeshed, I like the additional syntax cdef extern from "spam.h" late: but I don't know how much trouble it adds to require modifying the parser for this.

@jdemeyer
Copy link
Contributor Author

I ran into this again while working on https://trac.sagemath.org/ticket/23022 (I have now solved that ticket differently which doesn't require this PR).

In general, this issue occurs when implementing functions in C using cdef extern from "spam.c" and you want the C code in spam.c to access functions or variables defined by Cython. Maybe this is not a common use case, but I still think that it's a reasonable use case which should be supported.

@embray
Copy link
Contributor

embray commented May 19, 2017

Is there a specific reason for the existing default behavior? That is, is there any reason the include couldn't just come after the Cython defines by default?

@jdemeyer
Copy link
Contributor Author

That is, is there any reason the include couldn't just come after the Cython defines by default?

The same question has been asked above and answered in #1650 (comment)

@embray
Copy link
Contributor

embray commented May 19, 2017

Oops, sorry. Got it.

@jdemeyer
Copy link
Contributor Author

Allow me to ask for another ping on this PR. This is genuinely useful...

@scoder
Copy link
Contributor

scoder commented Sep 26, 2017

I'm not convinced that this being "genuinely useful". Even cdef extern from "spam.c" is not a very common use case. It seems to me that the obvious way to do this is to export something as public from the Cython module, include the generated header file from the external C code, and then link both together into the final extension module. If you care about inlining etc., enable LTO.

@jdemeyer
Copy link
Contributor Author

Even cdef extern from "spam.c" is not a very common use case.

I think it's not so relevant whether it is common or not. The question is whether is it sensible and useful. And maybe it is not so common because it is hard to make it work currently. People use what is documented and what works.

It seems to me that the obvious way to do this is to export something as public from the Cython module, include the generated header file from the external C code, and then link both together into the final extension module.

I personally dislike that solution because it is more complicated: it requires 2 manually-written C files (a file with the implementation to be added as extra source file and a file with the declarations to be used with cdef extern from).

Also keep in mind that (in my use cases at least), the C files make heavy use of macros: if it would be plain functions, I might as well implement them in Cython directly. This might make it even more complicated to untangle in a .c and a .h file.

Anyway, if the official Cython recommendation for situations like these is to use # distutils: sources = foo.c and cdef extern from "foo.h", then at least say so and close this PR as "wontfix".

@scoder
Copy link
Contributor

scoder commented Sep 27, 2017

it is more complicated

If there is something we can do to ease that setup, I'd be happy to hear about it. It's definitely more general than what this pull request gives. Improvements to the docs are also always welcome.

if the official Cython recommendation for situations like these is to use # distutils: sources = foo.c and cdef extern from "foo.h", then at least say so and close this PR as "wontfix".

Plus public declarations in the Cython module to generate a header file, yes. I think that's what people should do. It's the normal C way of dealing with this. Whether you then really build and link both as separate .o files, or still include the .c file directly as extern from, is up to you. The important thing is that the names/symbols that the external C code uses become explicit.

Thank you for the insightful discussion and your continuous efforts. I really appreciate it.

@scoder scoder closed this Sep 27, 2017
@jdemeyer
Copy link
Contributor Author

I think that's what people should do.

I don't agree but at least it's better than keeping a PR open forever. I hope that this implies that Cython will add support for this two-file way of compiling in cases where it doesn't properly work. For example, I already see one potential issue: if you want this to work with distributed .pxd files (i.e. in site-packages), you might need to do something analogous to #1654 with the # distutils: sources files.

still include the .c file directly as extern from, is up to you.

The whole point of this PR is that this doesn't always work. At least it does not work if the .c file needs access to Cython variables/functions.

@scoder
Copy link
Contributor

scoder commented Sep 27, 2017

it does not work if the .c file needs access to Cython variables/functions.

It has access to them, if you declare them as public. Cython will then generate a header file that defines them, which you can include in your .c file. But that probably means that they will be declared as extern in that file. Never mind... Back to linking then.

if you want this to work with distributed .pxd files (i.e. in site-packages), you might need to do something analogous to #1654 with the # distutils: sources files.

This is best handled with a properly exported C-API, and there is support for that via the normal .pxd cimport mechanism.

@jdemeyer
Copy link
Contributor Author

This is best handled with a properly exported C-API

Please elaborate, I have no idea what you mean with this.

My scenario would be as follows: I have a package.pxd file which is stored in .../site-packages/mypackage/package.pxd. Now, this package.pxd file also requires some functionality which is implemented in C. By the ongoing discussion, this would mean that package.pxd would look like

# distutils: sources = package_helper.c

cdef extern from "package_helper.h":
    ...

Both package_helper.c and package_helper.h would be stored in .../site-packages/mypackage.

PR #1654 dealt with automatically finding the package_helper.h file when compiling some external Cython module using package.pxd. So, I'm guessing (but I haven't checked) that something analogous will need to be done to automatically find the package_helper.c file.

@scoder
Copy link
Contributor

scoder commented Sep 27, 2017

this package.pxd file also requires some functionality which is implemented in C

... in which case it would not not be a .pxd file but an extension module (.pyx), which would then export a C-API via its .pxd file.

@pkittenis
Copy link
Contributor

I think it means that compiling .c files and linking them is done at install time and those files do not need to be stored in site-packages unlike .pxd definitions used as cython header files.

For the above example, the setup.py would have an Extension with package_helper.c as one of its sources. Setuptools compiles package_helper.c and links its .o with the cython extension so its symbols are available.

The resulting .so gets installed along with .pxd definitions. The .c file need not be installed in python site-packages as it's only used at compile/link time.

IMO the above functionality would be akin to a circular import in regular python modules. Module A uses function from module B, module B uses function from module A. The solution is to import neither but use a third module to 'link' them so that A and B import from C and not each other. The same applies in C.

When the C code is actually cython generated it gets more complicated but trying to make circular imports 'work' is prone to other issues. What happens when the .c files are compiled in parallel for example?

@jdemeyer
Copy link
Contributor Author

... in which case it would not not be a .pxd file but an extension module (.pyx), which would then export a C-API via its .pxd file.

No, I really mean what I wrote: a .pxd file implementing some functionality in C. This is what cysignals does for example and it is one of the motivating reasons for this pull request.

@jdemeyer
Copy link
Contributor Author

Setuptools compiles package_helper.c and links its .o with the cython extension so its symbols are available.

Can setuptools compile a dynamic library? I don't think so, but feel free to prove me wrong...

@scoder
Copy link
Contributor

scoder commented Sep 27, 2017

Can setuptools compile a dynamic library?

Yes.

a .pxd file implementing some functionality in C

Ok, but then we're back to the normal case of a .pxd file that declares extern code from a .h file (or .c, if you prefer). Why should that not work? You fixed it in #1654, right?

If it's a separately installed package, you really cannot expect the code over there to be able to get access to your own code in your local package. That seems like a really troublesome setup with way too tight coupling between packages.

@pkittenis
Copy link
Contributor

pkittenis commented Sep 27, 2017

Can setuptools compile a dynamic library? I don't think so, but feel free to prove me wrong...

Can do better than that, here's a real world example :)

@jdemeyer
Copy link
Contributor Author

Yes.

Care to elaborate? At least provide a link to documentation or sources?

I found these two SO posts but they don't have an answer either:

  1. https://stackoverflow.com/questions/19701208/setuptools-how-to-build-shared-library-before-package-installation
  2. https://stackoverflow.com/questions/37316177/distribute-a-python-package-with-a-compiled-dynamic-shared-library

Both have answers, but they are about extension modules, not dynamic libraries.

Can do better than that, here's a real world example :)

Again, that's an extension module, not a dynamic library.

@jdemeyer
Copy link
Contributor Author

Ok, but then we're back to the normal case of a .pxd file that declares extern code from a .h file (or .c, if you prefer). Why should that not work?

Exactly because of this PR that you closed! That works if Cython code wants to access the C code, but not if the C code wants to access the Cython code.

@pkittenis
Copy link
Contributor

Are you referring to compiling system libraries via setuptools? Then no, but why should it? It only needs access to the .h for importing and the .so for linking. The .h can be distributed in site-packages so it is available for cimport and the .so can be distributed as part of a binary package (wheel/multiple system packages).

In the case of wheels, there is functionality to correctly install dependent shared libraries so they are bundled together (auditwheel and delocate for linux/osx respectively). That's really a link time job rather than a compile time job, IMO.

@jdemeyer
Copy link
Contributor Author

Are you referring to compiling system libraries via setuptools? Then no, but why should it?

I thought that is what you meant with your answer, but it seems that I misunderstood. Anyway, I know how to create extension modules with Cython, but that seems besides the point.

@jdemeyer
Copy link
Contributor Author

Let me elaborate on the dynamic library vs. extension module: with dynamic linking, the symbols (variables/functions) are automatically there and can be used by C code without any special action. With extension modules, you can import stuff but that requires an explicit action. And I wouldn't know how to do that: as far as I know, Cython has no way to define extra code in the .pxd file which should be executed whenever the module is loaded (i.e. in the PyMODINIT_FUNC).

So the option that I was pursuing in this PR was to let Cython do the import in the usual way. But currently, I have no way to access those imports from C code.

@scoder
Copy link
Contributor

scoder commented Sep 27, 2017

Can setuptools compile a dynamic library?

Example: https://github.com/scoder/lupa/blob/lupa-1.5/setup.py#L179

@scoder
Copy link
Contributor

scoder commented Sep 27, 2017

I think we should move this discussion to the cython-users mailing list. The issue tracker (and especially a pull request) is the wrong place to discuss this. I closed this PR because I think that the problem it solves is way to narrow and specific to merit being a feature. There are other ways to solve the general problem.

@jdemeyer
Copy link
Contributor Author

There are other ways to solve the general problem.

I'm happy with one least one way :-) I'll write to cython-users, please answer there.

@jdemeyer
Copy link
Contributor Author

Since most opposition from @robertwb to this PR was about the syntax, maybe I could salvage it by not using any explicit syntax. Instead, depending on the contents of the cdef extern from block, Cython could guess whether it's best to put the #include before or after the Cython-generated declarations. For example, #includes which define types should come early, other #includes can come late.

@jdemeyer
Copy link
Contributor Author

New attempt at #1896

@scoder
Copy link
Contributor

scoder commented Jan 21, 2018

This is now tracked in #2079.

kliem pushed a commit to sagemath/memory_allocator that referenced this pull request May 27, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants