Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Organize imports & dependencies #15

Merged
merged 5 commits into from May 3, 2015

Conversation

Projects
None yet
2 participants
@c-w
Copy link
Contributor

commented May 2, 2015

Some changes to make the project more readily accessible:

  • Remove unused imports and make import style more standard.
  • Prune unused dependencies from requirements.txt and make required versions more lenient.
  • Git-ignore virtualenv directory for people managing dependencies in that way.

c-w added some commits May 2, 2015

Make import style more standard
Imports should be on separate lines.

Imports should be grouped with the order being most generic to least generic:
- standard library imports
- third-party imports
- application-specific imports

Be consistent in use of `from X import Y` versus `import X`.

Source: https://google-styleguide.googlecode.com/svn/trunk/pyguide.html
Remove unused dependencies
If we compare the output of the bash function `list_imports` defined below to
the contents of requirements.txt, we see that we have a bunch of required
modules that are actually never imported... We can get rid of these modules to
make the initial setup of the project easier.

```
list_imports() {
    grep -E '^(import\s.*|from\s.*import)' *.py \
    | cut -f2 -d':' \
    | cut -f2 -d' ' \
    | cut -f1 -d'.' \
    | sort -u
}
```
Be more lenient on the dependencies
This project isn't doing anything particularly fancy meaning that we don't need
to depend on the bleeding edge versions of our requirements i.e. we can use the
slightly older versions of scikit-learn, pandas and scipy that are available
pre-compiled in the package repositories of many Linux distributions. This makes
the project easier to set up and therefore more approachable.
@VikParuchuri

This comment has been minimized.

Copy link
Member

commented May 2, 2015

Thanks, will merge later today or tomorrow. (on phone now)
On May 2, 2015 1:47 PM, "Clemens Wolff" notifications@github.com wrote:

c-w wants to merge 5 commits into dataquestio:master from c-w:master:

Some changes to make the project more readily accessible:

  • Remove unused imports and make import style more standard.
  • Prune unused dependencies from requirements.txt and make required
    versions more lenient.
  • Git-ignore virtualenv directory for people managing dependencies in
    that way.

You can view, comment on, or merge this pull request online at:

#15
Commit Summary

  • Remove unused imports
  • Make import style more standard
  • Remove unused dependencies
  • Be more lenient on the dependencies
  • Git-ignore virtualenv directory

File Changes

Patch Links:


Reply to this email directly or view it on GitHub
#15.

import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer

from pandas import DataFrame

This comment has been minimized.

Copy link
@VikParuchuri

VikParuchuri May 3, 2015

Member

Are you using the from imports for external imports, and import MODULE_NAME for internal modules? Is this a recommended style? (not questioning, just curious)

This comment has been minimized.

Copy link
@c-w

c-w May 3, 2015

Author Contributor

A rule of thumb that I've seen thrown about here and there and that makes sense to me is to try to stick to import X for standard library imports (familiarity, name-spacing) and to use from X import Y for third party code which tends to have lots of nested modules (e.g. SciKit) and where classes are generally speaking already name-spaced (e.g. DataFrame is very recognizably from Pandas).

VikParuchuri added a commit that referenced this pull request May 3, 2015

Merge pull request #15 from c-w/master
Organize imports & dependencies

@VikParuchuri VikParuchuri merged commit fc99a57 into dataquestio:master May 3, 2015

1 check passed

dq 0.9436166346837191 RMSE
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.