Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bbknn and leiden functions #361

Merged
merged 8 commits into from
Nov 16, 2018
Merged

bbknn and leiden functions #361

merged 8 commits into from
Nov 16, 2018

Conversation

ktpolanski
Copy link
Contributor

  • created bbknn.py and leiden.py
  • added Park18 and Traag18 references
  • added bbknn import to api/pp.py and leiden import to api/tl.py

@ktpolanski
Copy link
Contributor Author

Think the Travis thing failing is because of the new dependencies

Copy link
Member

@flying-sheep flying-sheep left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure is E ImportError: No module named 'leidenalg'

You should add a new extra in setup.py:

https://github.com/theislab/scanpy/blob/04056e4798873ecec71adcc51804825245a55295/setup.py#L26

then install it in .travis.yml (pip install -e .[louvain,leiden,test] or so) and add a test.

scanpy/tools/leiden.py Outdated Show resolved Hide resolved
@flying-sheep
Copy link
Member

The tests don’t fail, but you should still add the extra to setup.py

@ktpolanski
Copy link
Contributor Author

The tests don’t fail, but you should still add the extra to setup.py

Is the fact that you don't list 'docs' in your pip thing (pip install -e .[louvain,leiden,test]) purposeful?

@flying-sheep
Copy link
Member

Yes. We don’t build the docs, we just check the readme. so we only need docutils and not sphinx and so on.

@ktpolanski
Copy link
Contributor Author

Ok, I've added the thing to setup.py, and did the pip command with all non-docs extras, but apparently the travis file didn't change.

@flying-sheep
Copy link
Member

Currently there are no tests, so those packages aren't actually needed. Looks good to me!

@ktpolanski
Copy link
Contributor Author

...so, what's the procedure with this? Do I need to do anything else?

Copy link
Member

@flying-sheep flying-sheep left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost done! Sorry for the delay

@@ -0,0 +1,70 @@
def bbknn(adata, copy=False, **kwargs):
'''
Batch balanced KNN, altering the KNN procedure to identify each cell's top neighbours in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be simplified I think. The whole file could just be:

try:
    from bbknn import bbknn
except ImportError:
    def bbknn(*args, **kwargs):
        raise ImportError('Please install BBKNN: `pip3 install bbknn`')

Except if we want control over the docstring? But as long as the upstream docstring is good, why bother?

Also the file is tab-indented while the rest of scanpy is 4-space-indented. Please adapt it.

partition_type = None,
copy = False,
**partition_kwargs):
'''
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use """

from .. import logging as logg

def leiden(
adata,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you please add typing here instead of putting it in the docstring? we’re moving to typehints:

def leiden(
    adata: AnnData,
    key_added: str = 'leiden',
    ...
    adjacency: Optional[sparse.spmatrix] = None,
    ...

And in the docs just:

Parameters
----------
adata
    The annotated data matrix.
key_added
    ``adata.obs`` key under which to add the cluster labels.
...

import leidenalg
logg.info('running Leiden clustering', r=True)
adata = adata.copy() if copy else adata
#are we clustering a user-provided graph or the default AnnData one?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use PEP8 style # comment, not #comment

@ktpolanski
Copy link
Contributor Author

I tried to do the things. I have no idea how to type the arguments, so there may be some trainwrecks there. Funny how it wasn't like that in the louvain function I based this on.

@ktpolanski
Copy link
Contributor Author

So apparently this now fails on the typing, on a line you provided as an example. Great. Can I please revert to how it was?

@flying-sheep
Copy link
Member

Hi! Sorry for frustrating you :( if you want I can fix and merge it manually. You're doing great work!

You simply need to import the things you're using in the annotations, then it'll work!

@ktpolanski
Copy link
Contributor Author

This will introduce a new problem, won't it? I'm using some sort of thingy from leiden

@ktpolanski
Copy link
Contributor Author

Ok, I'm stumped. Help please.

@flying-sheep
Copy link
Member

just leave that one parameter’s annotation out for now, and I’ll pull it 😉

@ktpolanski
Copy link
Contributor Author

Ok, did that. Sorry about the trouble, that was my first time trying to type stuff and I was a bit in the dark on this. Thanks for your patience.

@flying-sheep flying-sheep merged commit 9fa4c0f into scverse:master Nov 16, 2018
@flying-sheep
Copy link
Member

done. thank you!

@flying-sheep
Copy link
Member

flying-sheep commented Nov 16, 2018

I had to fix a few issues, if you want you can check it out: 66e64b4

  • I unified the parameter order with louvain

  • I actually import it in sc.tl

  • I added it to the docs here: https://scanpy.readthedocs.io/en/latest/api/#clustering-and-trajectory-inference

  • I added a test

  • I fixed the references (you had typos there: 2018 instead of 18 and a missing “L”)

  • You did this:

    partition_kwargs['weights'] = None
    if use_weights:
        weights = np.array(g.es['weight']).astype(np.float64)
    # “weights” is never used then

    But I assume you meant this. Am I correct?

    if use_weights:
        partition_kwargs['weights'] = np.array(g.es['weight']).astype(np.float64)

@ktpolanski
Copy link
Contributor Author

Good catch, sorry! I added the whole partition_kwargs dictionary late into the process and forgot to change that syntax.

In that case, could you also "hook up" pp.bbknn?

@flying-sheep
Copy link
Member

Good call, will do!

Koncopd pushed a commit that referenced this pull request Nov 16, 2018
@falexwolf
Copy link
Member

Thank you, @flying-sheep for fielding all the questions! Thank you for the PR, @ktpolanski! Is there anything still unclear where I can be of help?

@flying-sheep
Copy link
Member

Well, the bbknn docs look like this: https://scanpy.readthedocs.io/en/latest/api/index.html#batch-effect-correction

I went that way since I didn’t want to make it look like we coded it (with the docs hosted on our page and so on). Do you think that’s a good solution or would you like it to be done differently?

@ktpolanski
Copy link
Contributor Author

I mirrored the scanpy docstring style to the best of my ability, but for some reason it turned out a little bit different visually than the ones you have. Hope that's not too much of a problem.

@flying-sheep
Copy link
Member

Surely not! And since we’re linking it anyway, it doesn’t need to match ours.

I like that you’re using ``code`` instead of `code`, that looks much better than the way many many docstrings in scanpy do.

I suggest that you add some lines to your intersphinx mapping and use more links instead of code though, that makes the docs easier to use, since people can just click on things:

https://github.com/theislab/scanpy/blob/7b97d0d734970527230cf7f25ab15df874a143b3/docs/conf.py#L60-L69

I also added this hack to make links to AnnData and csr_matrix work:

https://github.com/theislab/scanpy/blob/7b97d0d734970527230cf7f25ab15df874a143b3/docs/conf.py#L236-L253

@falexwolf
Copy link
Member

try:
    from bbknn import bbknn
except ImportError:
    def bbknn(*args, **kwargs):
        raise ImportError('Please install BBKNN: `pip3 install bbknn`')

I went that way since I didn’t want to make it look like we coded it (with the docs hosted on our page and so on). Do you think that’s a good solution or would you like it to be done differently?

This is great!

https://scanpy.readthedocs.io/en/latest/api/scanpy.api.pp.bbknn.html#scanpy.api.pp.bbknn

But I actually don't see any docs there, I don't know why it doesn't find the original docstring... We'd like to have the reference to @ktpolanski preprint in the docstring in the first line together with a short summary of what it does and how it does it, just as for any other function. As this directly uses the implementation from @ktpolanski, we'd also want an explicit statement about that.

pp.bbknn is just an alias for bbknn.bbknn(). Refer to it for the documentation.

... can be removed. That should be evident...

In the case of the tl.leiden wrapper, I'd also add an explicit statement that this wraps the leiden package of Traag (2018).

Otherwise, all of this is great!

@flying-sheep
Copy link
Member

flying-sheep commented Nov 19, 2018

But I actually don't see any docs there, I don't know why it doesn't find the original docstring...

That’s because on readthedocs, bbknn isn’t installed, so it uses the dummy version.

We'd like to have the reference to @ktpolanski preprint in the docstring in the first line together with a short summary of what it does and how it does it, just as for any other function.

We could do that either by adding a docstring to the dummy version that links to https://bbknn.rtfd.io or by installing bbknn on rtd, and modifying the docstring programmatically. something like:

try:
    from bbknn import bbknn
    first_para, rest = bbknn.__doc__.split('\n\n', 1)
    bbknn.__doc__ =
        '{}\n\nFor a graphical explanation, visit '
        '`The bbknn project <https://github.com/Teichlab/bbknn>`__-\n\n{}'
        .format(first_para, rest)
except ImportError:
    ...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants