Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add tmap and ogdf #12498

Closed
wants to merge 52 commits into from
Closed

add tmap and ogdf #12498

wants to merge 52 commits into from

Conversation

hadim
Copy link
Member

@hadim hadim commented Aug 28, 2020

Checklist

  • Title of this PR is meaningful: e.g. "Adding my_nifty_package", not "updated meta.yaml"
  • License file is packaged (see here for an example)
  • Source is from official source
  • Package does not vendor other packages. (If a package uses the source of another package, they should be separate packages or the licenses of all packages need to be packaged)
  • If static libraries are linked in, the license of the static library is packaged.
  • Build number is 0
  • A tarball (url) rather than a repo (e.g. git_url) is used in your recipe (see here for more details)
  • GitHub users listed in the maintainer section have posted a comment confirming they are willing to be listed there

@conda-forge-linter
Copy link

Hi! This is the friendly automated conda-forge-linting service.

I wanted to let you know that I linted all conda-recipes in your PR (recipes/ogdf, recipes/tmap) and found some lint.

Here's what I've got...

For recipes/ogdf:

  • The recipe must have some tests.

For recipes/ogdf:

  • Recipe with the same name exists in bioconda: please discuss with @conda-forge/bioconda-recipes.

@hadim
Copy link
Member Author

hadim commented Aug 28, 2020

@conda-forge/help-c-cpp I am happy to make this work on windows but this is not blocking for me. Can you give some help?

The error is:

Call Stack (most recent call first):
  CMakeLists.txt:30 (include)


NMAKE : fatal error U1065: invalid option 'm'
Stop.

See https://dev.azure.com/conda-forge/feedstock-builds/_build/results?buildId=202273&view=logs&j=240f1fee-52bc-5498-a14a-8361bde76ba0&t=23010c01-6a9d-5f94-9037-792618644b77

@hadim
Copy link
Member Author

hadim commented Aug 28, 2020

ready for review

@hadim
Copy link
Member Author

hadim commented Aug 28, 2020

I can also disable windows if this is tricky to debug. Let me know.

@scopatz
Copy link
Member

scopatz commented Aug 29, 2020

Yeah, it is basically up to you whether you want to debug it here or in the feedstock

@hadim
Copy link
Member Author

hadim commented Aug 29, 2020

Yeah, it is basically up to you whether you want to debug it here or in the feedstock

Definitively not here. I was only asking in case somebody has an easy, obvious and quick fix.

Good to review/merge for me.

@hadim
Copy link
Member Author

hadim commented Aug 29, 2020

@scopatz It's all green, expect for the test of ogdf. It's a C++ only lib so I am not sure this is worth adding tests. If I still must add some, which kind should I add?

recipes/tmap/meta.yaml Outdated Show resolved Hide resolved
recipes/tmap/meta.yaml Outdated Show resolved Hide resolved
@hadim
Copy link
Member Author

hadim commented Aug 31, 2020

@scopatz let me know whether there is more to do here.

@conda-forge-linter
Copy link

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipes/ogdf, recipes/tmap) and found it was in an excellent condition.

Copy link
Member

@mbargull mbargull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for moving OGDF to conda-forge!

recipes/ogdf/meta.yaml Show resolved Hide resolved
recipes/ogdf/meta.yaml Outdated Show resolved Hide resolved
recipes/tmap/meta.yaml Outdated Show resolved Hide resolved
recipes/tmap/meta.yaml Outdated Show resolved Hide resolved
recipes/tmap/meta.yaml Outdated Show resolved Hide resolved
recipes/tmap/meta.yaml Outdated Show resolved Hide resolved
recipes/tmap/meta.yaml Show resolved Hide resolved
recipes/tmap/meta.yaml Show resolved Hide resolved
recipes/tmap/build.sh Show resolved Hide resolved
hadim and others added 3 commits August 31, 2020 12:43
Co-authored-by: Marcel Bargull <mbargull@users.noreply.github.com>
Co-authored-by: Marcel Bargull <mbargull@users.noreply.github.com>
Co-authored-by: Marcel Bargull <mbargull@users.noreply.github.com>
@hadim
Copy link
Member Author

hadim commented Sep 1, 2020

done already!

@hadim
Copy link
Member Author

hadim commented Sep 1, 2020

Almost done!

It looks like there is a nasty bug only on osx (py38):

tmap/tests/test_layout.py::TestLayout::test_lf_layout PASSED             [ 11%]
tmap/tests/test_lsh_forest.py::TestLSHForest::test_init PASSED           [ 22%]
tmap/tests/test_lsh_forest.py::TestLSHForest::test_query PASSED          [ 33%]
tmap/tests/test_lsh_forest.py::TestLSHForest::test_knn_graph PASSED      [ 44%]
tmap/tests/test_minhash.py::TestMinhash::test_init PASSED                [ 55%]
tmap/tests/test_minhash.py::TestMinhash::test_from_binary_array PASSED   [ 66%]
tmap/tests/test_minhash.py::TestMinhash::test_from_sparse_binary_array PASSED [ 77%]
tmap/tests/test_minhash.py::TestMinhash::test_from_string_array PASSED   [ 88%]
tmap/tests/test_minhash.py::TestMinhash::test_from_weight_array FAILED   [100%]

=================================== FAILURES ===================================
______________________ TestMinhash.test_from_weight_array ______________________

self = <test_minhash.TestMinhash object at 0x7ffbffbbe280>

    def test_from_weight_array(self):
        mh = tm.Minhash(8, 42, 64)
        a = mh.from_weight_array(tm.VectorFloat([0.2, 0.6, 0.22, 0.26, 0.62, 0.66]))
        b = mh.from_weight_array(tm.VectorFloat([0.26, 0.6, 0.22, 0.26, 0.62, 1.0]))
        assert len(a) == 128
>       assert round(mh.get_weighted_distance(a, b), 3) == 0.094
E       assert 0.062 == 0.094
E         +0.062
E         -0.094

tmap/tests/test_minhash.py:36: AssertionError

Full error: https://dev.azure.com/conda-forge/feedstock-builds/_build/results?buildId=203585&view=logs&j=e35d4f76-8ff2-5536-d795-df91e63eb9f7&t=fa7b4b17-b6ff-5c9c-8cfc-15f888c92310&l=4985

The failing test is:

    def test_from_weight_array(self):
        mh = tm.Minhash(8, 42, 64)
        a = mh.from_weight_array(tm.VectorFloat([0.2, 0.6, 0.22, 0.26, 0.62, 0.66]))
        b = mh.from_weight_array(tm.VectorFloat([0.26, 0.6, 0.22, 0.26, 0.62, 1.0]))
        assert len(a) == 128
        assert round(mh.get_weighted_distance(a, b), 3) == 0.094

and the C++ code corresponding to get_weighted_distance():

float
tmap::Minhash::GetWeightedDistance(std::vector<uint32_t>& vec_a,
                                   std::vector<uint32_t>& vec_b)
{
  float intersect = 0;
  size_t length = vec_a.size();

  for (unsigned int i = 0; i < length; i += 2)
    if (vec_a[i] == vec_b[i] && vec_a[i + 1] == vec_b[i + 1])
      intersect++;

  return 1.0f - 2.0f * intersect / length;
}

Any cross-platform C++ expert around to help debug this?

(I don't have an OSX machine so I cant debug this locally)

@scopatz
Copy link
Member

scopatz commented Sep 1, 2020

@hadim - you should probably skip those tests here, and (if you are feeling nice) you can report them as bugs upstream

@hadim
Copy link
Member Author

hadim commented Sep 1, 2020

@scopatz yeah, I thought about it but I am not very comfortable releasing a package with a failing test (even if it's on OSX only :-D ). I will wait a bit and see whether someone has an idea and in the meantime, I'll try to find a way to debug this.

- LICENSE_minisat.txt
- LICENSE_pugixml.txt
- LICENSE_snowhouse.txt
- LICENSE_tinydir.txt
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this really include all of these other packages?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like COIN_EXTERNAL_SOLVER_LIBRARIES can be used to point to the conda-forge version of coin: https://github.com/ogdf/ogdf/blob/master/doc/build.md There are probably similar variables for the other packages

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this really include all of these other packages?

It corresponds to all the licenses I have found in https://github.com/ogdf/ogdf/blob/master/LICENSE.txt. If there is more to include please let me know.

Copy link
Member Author

@hadim hadim Sep 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like COIN_EXTERNAL_SOLVER_LIBRARIES can be used to point to the conda-forge version of coin: https://github.com/ogdf/ogdf/blob/master/doc/build.md There are probably similar variables for the other packages

So you would like to package all those deps in conda forge or use the already packaged ones and use them to build ogdf?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you would like to package all those deps in conda forge or use the already packaged ones and use them to build ogdf?

Yep, exactly. This is for a couple of reasons, but the main ones are:

  1. Consistent ABIs across all packages in an environment (eg only one copy of COIN) so that we know that the packages will work together
  2. Use of shared libraries keep total size of all packages down because we aren't including a copy of COIN or zlib et cetra everytime another package uses it as a dependency
  3. It pushes some of these license issues to the user's machine rather than having to worry about re-licensing here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, we want to avoid having conda packages vendor other package's SOs. This can cause environments that conflict so both packages would have the same SO's.

@wolfv has been working on the COIN build and so he should comment more.

Copy link
Member Author

@hadim hadim Sep 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I totally understand the motivation here but this is a significant amount of work and I don't have the bandwidth for this at the moment. I am sorry for that because I know it took you time to review this PR.

We can let this open and I'll try to get back to it at some point (probably not soon) or someone else can take the lead.

We can also merge as it is and let someone else tackle this in the feedstock directly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I totally get that! It is a ton of work. Thanks so much @hadim! Hopefully someone with more time can come along and pick this up

@wolfv
Copy link
Member

wolfv commented Sep 4, 2020

@hadim I looked at the code & issue and I am not sure exactly what's going on. The code looks quite straight-forward. It looks more like an issue in tmap and not with conda-forge to me.

@beckermr
Copy link
Member

beckermr commented Sep 4, 2020

The main issues here @wolfv were that tmap is bundling pybind11 and COIN. We were hoping that you might have the build of COIN done?

@wolfv
Copy link
Member

wolfv commented Sep 4, 2020

I have just pushed a couple of improvements to the coin packaging, and I have figured out how to do the run_constrains so that coincbc and coin-or-cbc cannot be installed at the same time.

@beckermr
Copy link
Member

beckermr commented Sep 4, 2020

thanks @wolfv! It looks like to merge thus one then we need to make sure it depends on the conda-forge build of COIN and link it to that one.

@hadim
Copy link
Member Author

hadim commented Sep 4, 2020

thanks for looking into it @wolfv. The custom tmap channel (https://anaconda.org/tmap/tmap) has a build for osx so it should be possible to build against osx. The recipe is here https://github.com/reymond-group/tmap/tree/master/ogdf-conda/recipe

I don't know whether this is related but they have a conda_build_config.yaml file with:

CONDA_BUILD_SYSROOT:
  - /opt/MacOSX10.13.sdk [osx]

Could it be an issue related to the osx SDK?

@beckermr
Copy link
Member

beckermr commented Sep 4, 2020

See our documentation for how to use newer OS X sdks.

@hadim
Copy link
Member Author

hadim commented Sep 4, 2020

See our documentation for how to use newer OS X sdks.

I thought the latest one was used by default. Let's see whether it works!

@beckermr
Copy link
Member

beckermr commented Sep 4, 2020

We use 10.9 by default.

@hadim
Copy link
Member Author

hadim commented Sep 4, 2020

Same error.

@hadim
Copy link
Member Author

hadim commented Sep 4, 2020

A quick summary about what's need to be done to get this merged.

The following libs must be packaged in conda-forge:

Note that some of those libs are only used for testing so they probably could be removed from the above list.

For those already packaged, the version must be checked in order to be compatible with ogdf.

Another important task is to fix ogdf upstream by modifying the building logic in order to use the conda libraries instead of the embeded ones (see https://github.com/ogdf/ogdf/tree/master/src/ogdf/lib). This is probably not an easy and quick task.

For tmap, I think everything is fine (but blocked by ogdf).

Hope someone will find the time for this at some point.

@scopatz
Copy link
Member

scopatz commented Sep 8, 2020

I think you could put in a separate PR for tmap, if you wanted

@hadim hadim changed the title add tmap add tmap and ogdf Dec 8, 2020
@stale
Copy link

stale bot commented May 8, 2021

Hi friend!

We really, really, really appreciate that you have taken the time to make a PR on conda-forge/staged-recipes! conda-forge only exists because people like you donate their time to build and maintain conda recipes for use by the community.

In an effort to maintain this repository and increase the signal-to-noise for open PRs, the maintainers of staged-recipes close excessively old PRs after six months. This PR will remain open for another month, and then will be closed.

If you'd like to keep it open, please comment/push and we will be happy to oblige! Note that very old PRs will likely need to be rebased on master so that they can be rebuilt with the most recent CI scripts. If you have any trouble, or we missed reviewing this PR in the first place (sorry!), feel free to ping the team using a special command in a comment on the PR to get the attention of the staged-recipes team.

Cheers and thank you for contributing to this community effort!

@stale stale bot added the stale will be closed in 30 days label May 8, 2021
@stale
Copy link

stale bot commented Jun 8, 2021

Hi again! About a month ago, we commented on this PR saying it would be closed in another month if it was still inactive. It has been a month and so now it is being closed. Thank you so much for making it in the first place and contributing to the community project that is conda-forge. If you'd like to reopen this PR, please feel free to do so at any time!

Cheers and have a great day!

@stale stale bot closed this Jun 8, 2021
@N-Coder
Copy link

N-Coder commented Dec 28, 2021

Wow, thanks for putting all this effort into this! Having a working version of the OGDF installable via conda would be awesome, especially now that we also have python bindings available. Here's my current take on the status from one year ago. The following libraries are by now already available on conda-forge:

I was able to remove the source code for these three libraries from my copy of the OGDF and then use cmake to dynamically link against precompiled versions of them installed from conda-forge. So these three should pose no problem any more and I can take care of also upstreaming these changes to the OGDF.

The following libraries are only used for testing, so there's no runtime dependency here and thus also no need to package them for dynamic linking at runtime:

  • bandit
  • snowhouse
  • tinydir

This leaves us with abacus and minisat that are needed at runtime, but currently not available on conda-forge. Abacus seems to be inactive, with its GitHub repo archived, and there seems to be no development and no new releases since 2012. It in general seems to be superseded (within the Coin project, and in general) by Coin Clp. Minisat also has no real activity since 2013. So, both projects seem to be inactive for close to 10 years, while I also failed to make out any noticeable current userbase for both projects. Thus, I don't see a big benefit from spending more time on packaging them individually for conda-forge. My suggestion for preventing any runtime problems (i.e. name clashes) would be to put their code within the OGDF into a different namespace, as OGDF is using a modified version of Abacus anyways. Also the other issues mentioned here should only be minor for these two, as parallel usage of different version of both projects would then no longer be an issue (both are also used mostly internally within OGDF algorithms and not really intended to be exported), both are compiled in statically (so no shared object files), both libraries are pretty small, and the "licensing problem" is that of the OGDF, as the vendoring is done by every build of the OGDF.

If these suggestions would be acceptable to you (@scopatz @beckermr) I can open a new PR based on this one together with my changes outlined above for bringing in the OGDF part (if that's okay for @hadim). I'd leave the tmap point open as I have no clue about that library.

I have one further question for the conda-forge maintainers: the OGDF usually comes with a debug and release build that behave quite different when it comes to error checking etc. The recommended build for usage with ogdf-python is the debug build, as that also handles errors more gracefully and usually detects them early instead of segfaulting - which is very usefull when you are currently playing around with the algorithms e.g. in a Jupyter Notebook. I guess for tmap the recommended build would be the release one as users probably only care about the runtime of the algorithms that are used in this case, not how easy it is to fidget with them. Furthermore, I'm not quite sure how portable the release build is, as some parts of the OGDF use CPU specific functionality (which is probably also the reason why compiling on the target machine is the recommended way of installation for maximum efficiency). So I'm not quite sure how these two modes should be handled? Should we leave the ogdf package as release build and ensure that it uses no processor-specific functionality and is compiled with -mtune=native -march=generic and make a separate ogdf-dev package available (that should use none of these optimizations)? Anybody who wants more speed would then need to compile the optimzied release version theirselves as I'm not aware of conda packages having a build step on the target system like pip source distributions, right?

@hadim
Copy link
Member Author

hadim commented Dec 28, 2021

Thanks for the detailed update. I would be happy if you want to take the lead on the ogdf package.

On the dep things, as long as the licenses are the same I think it's fine and fair to have them embedded. But conda forge folks might comment further on this.

On the different builds (dev, cpu specific etc) I will let the conda forge folks to comment but be aware that a single conda recipe can output multiple build/packages as needed.

For tmap, I can take care of it once ogdf is merged (this will be quite straightforward.

Feel free to open a new PR based on this one and xref this PR for context + ping us on it.

@wolfv
Copy link
Member

wolfv commented Jan 20, 2022

I think your plan sounds good. Also agree with hadim on all points.

@N-Coder
Copy link

N-Coder commented Jan 20, 2022

@wolfv thanks a lot for your reply, I'll prepare a PR for these changes! Do you know who could help us with the debug/release/target architecture optimized build problem mention in the last paragraph?

@hadim hadim deleted the tmap branch February 20, 2023 02:34
@hadim hadim mentioned this pull request Feb 20, 2023
10 tasks
@hadim
Copy link
Member Author

hadim commented Feb 20, 2023

I have opened a new PR for ogdf only: #22085

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stale will be closed in 30 days
Development

Successfully merging this pull request may close these issues.

None yet

8 participants