New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support single value for attribute list in doc.to_array #1435

Merged
merged 7 commits into from Oct 20, 2017

Conversation

Projects
None yet
4 participants
@ramananbalakrishnan
Copy link
Contributor

ramananbalakrishnan commented Oct 18, 2017

Description

  • Updates Doc.to_array(attr_ids) to accept a single value in attr_ids
  • Partial fix for discussion in issue #1410.

Types of changes

  • New feature (non-breaking change adding functionality to spaCy)

Checklist:

  • My change requires a change to spaCy's documentation.
  • I have updated the documentation accordingly.
  • I have added tests to cover my changes.
  • All new and existing tests passed.
@ramananbalakrishnan

This comment has been minimized.

Copy link
Contributor

ramananbalakrishnan commented Oct 18, 2017

I am not a Cython expert, so could not get the reshape command working as-is. If there is a better to way to achieve the same result, do let me know.

Also, can I directly use intify_attrs to map a string name ("LEMMA") in the input list to the attribute id (spacy.attrs.lemma) ?

@explosion-bot

This comment has been minimized.

Copy link
Collaborator

explosion-bot commented Oct 18, 2017

Hi @ramananbalakrishnan, thanks for your pull request! 👍 It looks like you haven't filled in the spaCy Contributor Agreement (SCA) yet. The agrement ensures that we can use your contribution across the project. Once you've filled in the template, put it in the .github/contributors directory and add it to this pull request. Thanks a lot!

@ramananbalakrishnan

This comment has been minimized.

Copy link
Contributor

ramananbalakrishnan commented Oct 19, 2017

I have ported the changes over to the develop branch as well, for merging into v2. ( PR #1440 )
I should be able to keep track of suggestions from either PR and fix across both branches.

@honnibal

This comment has been minimized.

Copy link
Member

honnibal commented Oct 19, 2017

Thanks for this! It will be merged in time for the new v2 release.

@honnibal

This comment has been minimized.

Copy link
Member

honnibal commented Oct 19, 2017

Also, can I directly use intify_attrs to map a string name ("LEMMA") in the input list to the attribute id (spacy.attrs.lemma) ?

I think so. It's possible there's some quirks, but yes that's what we would want to be using.

@ines ines added v2 port and removed v2 port labels Oct 19, 2017

@ramananbalakrishnan ramananbalakrishnan force-pushed the ramananbalakrishnan:update_to_array branch from 1de7074 to 5941aa9 Oct 20, 2017

@ramananbalakrishnan

This comment has been minimized.

Copy link
Contributor

ramananbalakrishnan commented Oct 20, 2017

Done. to_array now supports string names in attr_ids.
Updated tests and the v2 PR as well.

@ramananbalakrishnan ramananbalakrishnan force-pushed the ramananbalakrishnan:update_to_array branch from 75674d2 to fbccc8c Oct 20, 2017

Make small changes to Doc.to_array
* Change type-check logic to 'hasattr' (Python type-checking is brittle)
* Small 'house style' edits, mostly making code more terse.
@honnibal

This comment has been minimized.

Copy link
Member

honnibal commented Oct 20, 2017

🎉

Thanks for this! Made some small changes, especially changing use of type in the conditionals. This is a little bit like the weird equality checking stuff in Javascript --- it's best to avoid type-based checks in Python if possible, because the language makes it hard to make the logic fully correct. hasattr checks are ugly, but it's a bit easier to cover all the cases this way.

@ramananbalakrishnan

This comment has been minimized.

Copy link
Contributor

ramananbalakrishnan commented Oct 20, 2017

got it! looks much cleaner now.

Also, since you are directly returning output.reshape(), looks like the output_1D variable is redundant now.

honnibal added some commits Oct 20, 2017

Fix compile error
Closures not allowed in cpdef

@ines ines merged commit 2a0ab6f into explosion:master Oct 20, 2017

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

@ramananbalakrishnan ramananbalakrishnan deleted the ramananbalakrishnan:update_to_array branch Oct 20, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment