Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Module relocation #1181

Merged
merged 25 commits into from Apr 30, 2015
Merged

Module relocation #1181

merged 25 commits into from Apr 30, 2015

Conversation

@curita
Copy link
Member

@curita curita commented Apr 21, 2015

Relocations listed in #1063

@curita curita force-pushed the curita:module-relocation branch 2 times, most recently from 00fccb0 to ce5a725 Apr 21, 2015
@@ -64,7 +64,7 @@ Does Scrapy work with HTTP proxies?

Yes. Support for HTTP proxies is provided (since Scrapy 0.8) through the HTTP
Proxy downloader middleware. See
:class:`~scrapy.contrib.downloadermiddleware.httpproxy.HttpProxyMiddleware`.
:class:`~scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware`.

This comment has been minimized.

@kmike

kmike Apr 21, 2015
Member

'Middleware' sound plural to me even without 's' in the end. We need an English native speaker here :)

This comment has been minimized.

@curita

curita Apr 21, 2015
Author Member

Yeah, sounds weird, but we generally use 'middleware' to refer to a single middleware class, and 'middlewares' when we're talking of several of these classes/components in the docs. I think we're using that term in a more concrete sense, by being equal to 'middleware class'.

I'll keep the same convention for spider "middlewares" until we decide what to do with this, so we can amend the docs along with it.

This comment has been minimized.

@curita

curita Apr 21, 2015
Author Member

I forgot, it's used with a trailing 's' in settings too, we have DOWNLOADER_MIDDLEWARES and SPIDER_MIDDLEWARES. I think I prefer to keep it as "middlewares", seems like its usage is consistent in all places where is referred.

@curita curita force-pushed the curita:module-relocation branch 2 times, most recently from d1e2987 to b20fea5 Apr 23, 2015
@curita
Copy link
Member Author

@curita curita commented Apr 23, 2015

I've finished with all relocations agreed in #1063

Some remaining issues that could be handled in this PR:

  • Contributing to Scrapy entry in the docs still talks about "Scrapy contrib": https://github.com/Curita/scrapy/blob/module-relocation/docs/contributing.rst#scrapy-contrib. Should we remove it, or keep it and still promote to submit extensions to scrapy/contrib? Right now there isn't any example for extensions there since the whole scrapy/contrib is gone.
  • Some modules that I thought they shouldn't be renamed to their plurals:
    • scrapy/resolver.py
    • scrapy/loader/: ItemLoader is a single class, with many processors
    • scrapy/extensions/feedexport.py: FeedExport is a single class, with many storages
    • scrapy/utils/defer.py
    • scrapy/utils/signal.py
    • scrapy/utils/test.py
    • scrapy/utils/spider.py
  • There are some files containing common code from packages, which could be moved into those packages:
    • scrapy/command.py: Could be moved inside scrapy/commands/
    • scrapy/linkextractor.py: Could be moved inside scrapy/linkextractors
  • Some managers not named as such (with is inconsistent with scrapy/signalmanager.py):
    • scrapy/extension.py -> scrapy/extensionmanager.py.
    • scrapy/middleware.py -> scrapy/middlewaremanager.py
    • scrapy/core/downloader/middleware.py ->scrapy/core/downloader/middlewaremanager.py
  • Some shortened module names that could be extended:
    • scrapy/statscols.py -> scrapy/statscollectors.py
    • scrapy/squeues.py -> scrapy/serializablequeues.py
    • scrapy/cmdline.py -> scrapy/commandline.py
    • scrapy/core/spidermw.py -> scrapy/core/spidermiddlewaremanager.py: Related to last issue
    • scrapy/utils/reqser.py -> scrapy/utils/requestserializ{e,ation}.py
    • scrapy/utils/testproc.py -> scrapy/utils/testprocess.py
@curita curita changed the title [WIP] Module relocation Module relocation Apr 23, 2015
@nyov
Copy link
Contributor

@nyov nyov commented Apr 28, 2015

+1
Nice work.

I agree command.py could fit into commands/__init__.py, and linkextractor.py into linkextractors/__init__.py.
Also think extending the manager module names makes sense.
But I'm not sure about the other module names, keeping them short is okay for me (but I would rename spidermw.py to spidermwm.py in that case).
Except statscollectors.py, that looks good to me (because cols is short for columns in my head), but the ones beneath util/ are just utilities and squeues, cmdline make sense to me as they are.

About the docs, I think any mention of contrib in contributing can get lost.
But for confused users upgrading, maybe put a deprecation notice elsewhere (upgrading infos?), akin to this:

Scrapy Contrib

Scrapy contrib shared a similar rationale as Django contrib. It has now been removed in favor of cleaner code boundaries and splitting features into sub-packages. All code therein has been moved into scrapy itself or new packages.

@curita
Copy link
Member Author

@curita curita commented Apr 28, 2015

@nyov Thanks for the review! :)

I fully agree with your comments, I didn't mind the shortened filenames except for statscols.py, when I renamed it to plural it sounded weirder. I'm going to start implementing those changes in separated commits so we can revert them if we decide otherwise.

@curita
Copy link
Member Author

@curita curita commented Apr 28, 2015

There's a MiddlewareManager that I haven't found earlier: scrapy/pipelines/__init__.py. Not sure what to do with it, I think it makes sense to leave it there, but we could move scrapy/extension.py (ExtensionManager) to scrapy/extensions/__init__.py to follow the same convention.

I think both scrapy/core/downloader/middleware.py and scrapy/core/spidermw.py are fine where they are, and scrapy/middleware.py needs to be kept at top level since it's used by the other MiddlewareManagers.

@nyov
Copy link
Contributor

@nyov nyov commented Apr 28, 2015

I noticed the same earlier (extension.py) but thought nothing of it. But I agree, if it fits.
In that case, what about putting spider.py into spiders/__init__.py?
Maybe that's a special case, can't say if it would be a good move here.

@curita
Copy link
Member Author

@curita curita commented Apr 28, 2015

Now that I think of scrapy/core/downloader/middleware.py (DownloaderMiddlewareManager) could be moved to scrapy/downloadermiddlewares/__init__.py and scrapy/core/spidermw.py (SpiderMiddlewareManager) to scrapy/spidermiddlewares/__init__.py, I like that all managers are going to be in a consistent place that way.

I like moving spider.py to spiders/__init__.py, but yes, that's maybe something to be more careful with.

@nyov
Copy link
Contributor

@nyov nyov commented Apr 28, 2015

I can't say if it's okay to move them out of 'core' (depends on the meaning of "core" here; maybe the interfaces shouldn't be mixed with the implementations here - if that's what they are).

For spider.py I had in mind what the "spiders in other languages" working group might say to that. If spider.py is a python-only spider implementation (not an interface), I'd move it. But if it'll become something important for other language spiders, it should stay. May be too early to think about that, though.

@curita curita force-pushed the curita:module-relocation branch from b20fea5 to 62191de Apr 30, 2015
@curita
Copy link
Member Author

@curita curita commented Apr 30, 2015

I rebased master after the python logging merge. I'm merging this PR as discussed in last meeting, and I'll open a new one for the pending relocations mentioned.

curita added a commit that referenced this pull request Apr 30, 2015
Module relocation
@curita curita merged commit 5155162 into scrapy:master Apr 30, 2015
1 check passed
1 check passed
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@curita curita deleted the curita:module-relocation branch Apr 30, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

3 participants