Improve client to use it via IDE #54
Conversation
| Not a public constructor: use :class:`Project` instance to get a
| :class:`Activity` instance. See :attr:`Project.activity` attribute.
| Please note that list() method can use a lot of memory and for a large
The lines were moved to the _Proxy.list method to avoid copy-pasting them everywhere.
The problem with moving this to a parent class is that it becomes invisible when you check help or the docs for the method. For example, I think we should build documentation based on docstrings later on, so this line should appear in each list method that can cause such issues.
@chekunkov Initial work is done, let me know if there's anything that could be improved; I'm afraid I have lost freshness of vision and may miss something important. Most of the changes in the PR are related to indentation, renaming params and fixing docstrings.
scrapinghub/client/__init__.py
Outdated
| def __init__(self, auth=None, dash_endpoint=None, **kwargs):
| # FIXME not sure it's worth to keep all HS kwargs here, most of them
| # are appliable only to oldy Hubstorage client
I'd keep only the necessary kwargs here and link to the HubstorageClient docs for more info.
Well, let's keep only necessary kwargs then
scrapinghub/client/__init__.py
Outdated
| connection_timeout=None,
| max_retries=None,
| max_retry_time=None,
| user_agent=None,
I think we can revert that, as I mentioned above, but if for some reason you decide to keep it, note that you need to forward the input args here, not use None everywhere.
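A minimal sketch of what forwarding the input args could look like (the `_hsclient_kwargs` attribute is an assumption for illustration, not the actual client code):

```python
class ScrapinghubClient:
    def __init__(self, auth=None, dash_endpoint=None,
                 connection_timeout=None, max_retries=None,
                 max_retry_time=None, user_agent=None, **kwargs):
        # Forward the caller's values instead of hard-coding None
        # when building kwargs for the underlying Hubstorage client.
        self._hsclient_kwargs = dict(
            auth=auth,
            endpoint=dash_endpoint,
            connection_timeout=connection_timeout,
            max_retries=max_retries,
            max_retry_time=max_retry_time,
            user_agent=user_agent,
            **kwargs,
        )
```

This way a caller's `max_retries=5` actually reaches the wrapped client rather than being silently replaced by the default.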
scrapinghub/client/collections.py
Outdated
| setattr(self, method, wrapped)
|
| def list(self, *args, **kwargs):
| def list(self, requests_params=None, **params):
The API docs mention many more parameters; I think they are applicable here: https://doc.scrapinghub.com/api/collections.html#collections-project-id-type-collection
Oh, nice, I'll add it 👍
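One way the extra API filters could be threaded through is to merge only the filters the caller actually set, so that None never reaches the request. This is a standalone sketch; `build_list_params` is a hypothetical helper, and the parameter names are taken from the linked collections API page:

```python
def build_list_params(key=None, prefix=None, prefixcount=None,
                      startts=None, endts=None, **params):
    # None means "not provided" and must not end up in the request,
    # so only non-None filters are merged into the params dict.
    filters = dict(key=key, prefix=prefix, prefixcount=prefixcount,
                   startts=startts, endts=endts)
    params.update({k: v for k, v in filters.items() if v is not None})
    return params
```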
scrapinghub/client/items.py
Outdated
| Returns:
| dict: updated set of params
| :return: a dict with updated set of params.
| :rtype: dict.
please remove the dot
scrapinghub/client/jobs.py
Outdated
| :param \*\*params: (optional) a set of filters to apply when counting
| jobs (e.g. spider, state, has_tag, lacks_tag, startts and endts).
| :return: jobs count.
| :rtype: int.
I don't think the rtype value should have a trailing dot; please check across the repo.
scrapinghub/client/jobs.py
Outdated
| ('has_tag', has_tag), ('lacks_tag', lacks_tag),
| ('startts', startts), ('endts', endts),
| ('meta', meta)]
| params.update({k: v for k, v in filter_kwargs if v is not None})
Possibly you can generalize the filtering? Something like:

    update_kwargs(kwargs, spider=spider, state=state, count=count,
                  spidername=spider_name)

Note the last one: you can map to a different name if necessary.
Sounds good, will do, thanks
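The suggested helper could be as small as this (a sketch of the idea; the real signature in the repo may differ):

```python
def update_kwargs(kwargs, **params):
    # Copy only the filters that were actually provided, so None
    # values never leak into the request parameters. The keyword
    # names double as the target keys, which allows renaming at
    # the call site (e.g. spidername=spider_name).
    kwargs.update({k: v for k, v in params.items() if v is not None})

params = {}
update_kwargs(params, spider="myspider", state=None, has_tag=None)
# params is now {'spider': 'myspider'}
```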
scrapinghub/client/jobs.py
Outdated
| return Job(self._client, str(job_key))
|
| def summary(self, _queuename=None, **params):
| # FIXME there must me spider_name instead of spider_id for consistency
spider_name or spider? list takes spider, schedule takes spider_name :) Or maybe the methods above should be filtered by spider_id instead, for consistency with summary and iter_last?
Right, so spider or spider_id then... I'd vote for spider; the name is easier to remember.
scrapinghub/client/jobs.py
Outdated
| :meth:`ScrapinghubClient.get_job` and :meth:`Jobs.get` methods.
| :ivar projectid: in integer project id.
| :ivar project_id: in integer project id.
"in" is not needed here
scrapinghub/client/jobs.py
Outdated
| Not a public constructor: use :class:`Job` instance to get a
| :class:`Jobmeta` instance. See :attr:`Job.metadata` attribute.
| Usage::
I'm a bit confused: why is Usage followed by ::? When reST renders this, it will produce a literal block, and I don't think we want a literal block for the whole usage description. I expect a list of short descriptions, each followed by a literal block with an example:

Usage:

- get job metadata instance::

    >>> job.metadata
    <scrapinghub.client.jobs.JobMeta at 0x10494f198>
You're right, I'll fix it everywhere 👌
scrapinghub/client/utils.py
Outdated
| - each string defines:
| a successor method name to proxy 1:1 with origin method
| - each tuple should consist of 2 strings:
| a successor method name and an origin method name
Why literal block ::?
- Add return types in docstrings
- Add hints for kwargs, fix docstrings
- Use update_kwargs helper to unify logic
- Rename spider_args -> job_args
- Unify spider param for different methods
- Don't return count from job.update_tags
Force-pushed from aa270a1 to b6c86ff
Tests are fixed, merging.
The main goals of the PR are the following:
I'm also reviewing all the current docstrings and fixing any inconsistencies.