[backend] Add Zulip Backend #667

Closed

Conversation

vchrombie
Member

Fixes #630

@vchrombie
Member Author

This PR is not complete yet and needs a lot of improvement. I am still figuring out how to use an offset-based method (I am not sure whether it is applicable 🤔).

Some of the Zulip chats where we can test this are:

  1. https://python.zulipchat.com/
  2. https://anitab-org.zulipchat.com/

The backend will be used like this:

perceval zulip 'https://python.zulipchat.com/' 'importlib' -e 'BOT_EMAIL_ADDRESS' -t 'BOT_API_KEY'

You can generate the BOT_EMAIL_ADDRESS and BOT_API_KEY from the Zulip chat itself (you need to join the chat).
Click on the ⚙️ >> Settings >> Your bots >> Add a new bot

I have written a small script that fetches a fixed number of messages rather than iterating over all of them.
https://gist.github.com/vchrombie/bfd6d967e03bbe28cf4241995fb4b91b
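
For reference, a single fixed-size request against Zulip's `GET /api/v1/messages` endpoint could look like the sketch below. This is not the script from the gist: the helper names (`build_messages_request`, `fetch_messages`) and the default batch size are assumptions made for illustration, and only the standard library is used.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_messages_request(base_url, stream, email, api_key, anchor=0, num_after=100):
    """Build a GET /api/v1/messages request for one stream (hypothetical helper)."""
    params = {
        "anchor": anchor,
        "num_before": 0,
        "num_after": num_after,
        # `narrow` is sent as a JSON-encoded list of filters.
        "narrow": json.dumps([{"operator": "stream", "operand": stream}]),
    }
    url = base_url.rstrip("/") + "/api/v1/messages?" + urllib.parse.urlencode(params)
    # Zulip uses HTTP basic auth with the bot email and its API key.
    token = base64.b64encode(f"{email}:{api_key}".encode()).decode()
    return urllib.request.Request(url, headers={"Authorization": "Basic " + token})


def fetch_messages(request):
    """Send the request and return the list of messages from the JSON body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)["messages"]
```

Building the request separately from sending it keeps the parameter handling easy to check without touching the network.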

@valeriocos
Member

Hi @vchrombie , sorry for the late reply on this PR! I'm on it

@valeriocos valeriocos self-requested a review May 23, 2020 10:08
Member

@valeriocos valeriocos left a comment

Hi @vchrombie, the PR looks promising! I left some comments and code suggestions that make it possible to run the backend with the info you provided here.

I understand that the support for pagination is missing, please give it a try and let me know if you find any blocker. For the next iteration, please also consider adding the docstring descriptions.

Thanks for your time!

@vchrombie
Member Author

Thanks for the suggestions and pointers @valeriocos.

I understand that the support for pagination is missing, please give it a try and let me know if you find any blocker.

I have an approach in mind. Once I fix the issue from #667 (review comment), I will implement it and let you know. 😃

@valeriocos valeriocos self-requested a review May 24, 2020 08:44
@valeriocos
Member

valeriocos commented May 24, 2020

Hi @vchrombie, I guess I made a mistake yesterday when copying the diffs, since the backend was working for me. Starting from your current code, you should apply the following diff:

diff --git a/perceval/backends/core/zulip.py b/perceval/backends/core/zulip.py
index 4d61a5a..0bf5b09 100644
--- a/perceval/backends/core/zulip.py
+++ b/perceval/backends/core/zulip.py
@@ -160,6 +160,15 @@ class Zulip(Backend):
         ts = float(item['timestamp'])
         return ts
 
+    @staticmethod
+    def metadata_category(item):
+        """Extracts the category from a Zulip item.
+
+        This backend only generates one type of item which is
+        'message'.
+        """
+        return CATEGORY_MESSAGE
+
     def _init_client(self, from_archive=False):
         """Init client"""
 
@@ -191,19 +200,6 @@ class ZulipClient(HttpClient):
 
         super().__init__(url, archive=archive, from_archive=from_archive, ssl_verify=ssl_verify)
 
-    def fetch(self, url, payload=None, headers=None, auth=None):
-        """Fetch the data from a given URL.
-
-        :param url: link to the resource
-        :param payload: payload of the request
-        :param headers: headers of the request
-        :param auth: auth of the request
-
-        :return: a response object
-        """
-        response = super().fetch(url, payload)
-        return response
-
     def get_messages(self, anchor):
         """Fetch the messages."""

This is the output I'm getting:

/home/slimbook/Escritorio/sources/venv/bin/python3 /home/slimbook/Escritorio/sources/perceval/bin/perceval zulip https://python.zulipchat.com/ importlib -e zulip_perceval-bot@zulipchat.com -t XNurdLPB65zy969F8mmxbbgKAcyG2gIE
[2020-05-24 10:46:46,417] - Sir Perceval is on his quest.
{
    "backend_name": "Zulip",
    "backend_version": "0.1.0",
    "category": "message",
    "classified_fields_filtered": null,
    "data": {
        "avatar_url": "https://zulip-avatars.s3.amazonaws.com/1000/1cd87892343b6620726b112851f0b66cbda4a68f?x=x&version=4",
        "client": "Internal",
        "content": "Welcome to #**importlib**.",
        "content_type": "text/x-markdown",
        "display_recipient": "importlib",
        "flags": [
            "read",
            "historical"
        ],
        "id": 159280404,
        "is_me_message": false,
        "reactions": [],
        "recipient_id": 303900,
        "sender_email": "welcome-bot@zulip.com",
        "sender_full_name": "Welcome Bot",
        "sender_id": 100007,
        "sender_realm_str": "zulipcore",
        "sender_short_name": "welcome-bot",
        "stream_id": 187177,
        "subject": "hello",
        "submessages": [],
        "timestamp": 1551022959,
        "topic_links": [],
        "type": "stream"
    },
    "origin": "https://python.zulipchat.com/importlib",
    "perceval_version": "0.13.0",
    "search_fields": {
        "item_id": "159280404"
    },
    "tag": "https://python.zulipchat.com/importlib",
    "timestamp": 1590310008.240044,
    "updated_on": 1551022959.0,
    "uuid": "dba374094af66e532b4e3af07c8db760761811f8"
}
{
    "backend_name": "Zulip",
    "backend_version": "0.1.0",
    "category": "message",
    "classified_fields_filtered": null,
    "data": {
        "avatar_url": "https://secure.gravatar.com/avatar/163acaf31cfd01645493404a2d379df6?d=identicon&version=1",
        "client": "ZulipElectron",
        "content": "@**Steven Ma** Try this in a `Dockerfile`:\n```\nFROM ubuntu:bionic\n\nRUN apt update\nRUN apt upgrade -y\nRUN apt install -y python3 python3-dev python3-venv python3-pip\nRUN python3 -m pip install -U pip tox pip-run\n```",
        "content_type": "text/x-markdown",
        "display_recipient": "importlib",
        "flags": [
            "read",
            "historical"
        ],
        "id": 159281179,
        "is_me_message": false,
        "reactions": [],
        "recipient_id": 303900,
        "sender_email": "jaraco@jaraco.com",
        "sender_full_name": "Jason R. Coombs",
        "sender_id": 113001,
        "sender_realm_str": "python",
        "sender_short_name": "jaraco",
        "stream_id": 187177,
        "subject": "Testing on Linux in docker",
        "submessages": [],
        "timestamp": 1551024348,
        "topic_links": [],
        "type": "stream"
    },
    "origin": "https://python.zulipchat.com/importlib",
    "perceval_version": "0.13.0",
    "search_fields": {
        "item_id": "159281179"
    },
    "tag": "https://python.zulipchat.com/importlib",
    "timestamp": 1590310008.241469,
    "updated_on": 1551024348.0,
    "uuid": "2e6a37c4b312e5cfbce05fbacdc02ecb3ecb673a"
}
[2020-05-24 10:46:48,242] - Summary of results

	   Total items: 	2
	Items produced: 	2
	 Items skipped: 	0

	Last item UUID: 	2e6a37c4b312e5cfbce05fbacdc02ecb3ecb673a
	Last item date: 	2019-02-24 16:05:48+00:00

	Min. item date: 	2019-02-24 15:42:39+00:00
	Max. item date: 	2019-02-24 16:05:48+00:00

	Min. offset: 	-	Max. offset: 	-	Last offset: 	-


[2020-05-24 10:46:48,246] - Sir Perceval completed his quest.

Process finished with exit code 0
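
The `updated_on` values and the dates in the summary above follow from each message's `timestamp` field, which the backend converts to a float (see `ts = float(item['timestamp'])` in the diff). A minimal sketch of that conversion is shown below; `to_utc` is a helper added only for this example and is not part of the PR.

```python
from datetime import datetime, timezone


def metadata_updated_on(item):
    """Return the message timestamp as a float of seconds since the epoch."""
    return float(item["timestamp"])


def to_utc(epoch_seconds):
    """Convert epoch seconds to a timezone-aware UTC datetime (example helper)."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


first = {"timestamp": 1551022959}
last = {"timestamp": 1551024348}
print(to_utc(metadata_updated_on(first)))  # 2019-02-24 15:42:39+00:00
print(to_utc(metadata_updated_on(last)))   # 2019-02-24 16:05:48+00:00
```

The two printed dates match the "Min. item date" and "Max. item date" lines in the summary.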

@valeriocos valeriocos removed their request for review May 24, 2020 08:51
@vchrombie
Member Author

Thanks for the help @valeriocos. I got it right. 😃

Also, we have only item_id in the search_fields

    "search_fields": {
        "item_id": "159281179"
    },

I was thinking of adding stream too

    "search_fields": {
        "item_id": "159281179",
        "stream": "importlib"
    },

Also, do you think any other fields would be worth adding?

@valeriocos
Member

You're welcome @vchrombie ! :)

I was thinking of adding stream too

That's a good idea!

Also, do you think any other fields would be worth adding?

The idea of the search_fields is to simplify searches by avoiding the need to inspect what is inside the data attribute. For the moment the id and stream seem enough; maybe in the future we could add the flags or the topic_links if we see that they contain valuable info.
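
As an illustration of the `item_id` plus `stream` proposal, the sketch below builds the two fields from a raw message. `build_search_fields` is a hypothetical standalone helper, not Perceval's API, and it assumes that `display_recipient` holds the stream name for stream messages, as in the sample output earlier in this thread.

```python
def build_search_fields(item):
    """Collect the top-level search fields for a Zulip message (hypothetical helper)."""
    return {
        "item_id": str(item["id"]),
        # For stream messages, `display_recipient` holds the stream name.
        "stream": item["display_recipient"],
    }


message = {"id": 159281179, "display_recipient": "importlib", "type": "stream"}
search_fields = build_search_fields(message)
# search_fields == {"item_id": "159281179", "stream": "importlib"}
```

With these fields at the top level, a consumer can filter on `search_fields["stream"]` without looking inside `data`.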

@coveralls

coveralls commented May 24, 2020

Coverage Status

Coverage increased (+0.03%) to 97.025% when pulling dc581e0 on vchrombie:zulip into 8fd7bea on chaoss:master.

@vchrombie
Member Author

vchrombie commented May 24, 2020

Hi @valeriocos
I updated the PR according to your suggestions and it worked. I implemented the chain fetching too.
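
A minimal sketch of what chained fetching can look like is given below; it is not the code from this PR. `fetch_page` is a hypothetical callable standing in for one HTTP request, and the sketch assumes that the response carries a `messages` list (and possibly a `found_newest` flag) and that the anchor may be any integer, so the next request can start at the last seen id plus one.

```python
def iter_messages(fetch_page, batch_size=100):
    """Yield every message by chaining fixed-size requests.

    `fetch_page(anchor, batch_size)` is a hypothetical callable that returns
    the decoded JSON body of one request, i.e. a dict with a `messages` list
    and, optionally, a `found_newest` flag.
    """
    anchor = 0
    while True:
        page = fetch_page(anchor, batch_size)
        messages = page["messages"]
        if not messages:
            break
        yield from messages
        if page.get("found_newest", False):
            break
        # Start the next request just past the last message seen, so the
        # anchor message itself is not returned twice.
        anchor = messages[-1]["id"] + 1


def fake_fetch_page(anchor, batch_size):
    """In-memory stand-in for the HTTP call, used to exercise the loop."""
    ids = [i for i in range(1, 8) if i >= anchor][:batch_size]
    return {"messages": [{"id": i} for i in ids], "found_newest": not ids or ids[-1] == 7}


fetched = [m["id"] for m in iter_messages(fake_fetch_page, batch_size=3)]
```

Passing the request function in as a parameter keeps the loop independent of the HTTP client, which also makes it straightforward to test with canned pages.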

I have tested the backend and here are the results.

  • python.zulipchat.com, stream: importlib
[2020-05-24 16:53:48,804] - Summary of results

	   Total items: 	12
	Items produced: 	12
	 Items skipped: 	0

	Last item UUID: 	a836795bdaf7a49c49f05424e4875651d059432c
	Last item date: 	2019-09-10 00:53:55+00:00

	Min. item date: 	2019-02-24 15:42:39+00:00
	Max. item date: 	2019-09-10 00:53:55+00:00

	Min. offset: 	-	Max. offset: 	-	Last offset: 	-


[2020-05-24 16:53:48,805] - Sir Perceval completed his quest.
  • python.zulipchat.com, stream: core/help
[2020-05-24 15:53:49,517] - Summary of results

	   Total items: 	820
	Items produced: 	820
	 Items skipped: 	0

	Last item UUID: 	77fc711abe7ca9738d7c496b1efbcdf455b5e249
	Last item date: 	2020-04-27 01:38:46+00:00

	Min. item date: 	2018-04-05 00:49:12+00:00
	Max. item date: 	2020-04-27 01:38:46+00:00

	Min. offset: 	-	Max. offset: 	-	Last offset: 	-


[2020-05-24 15:53:49,518] - Sir Perceval completed his quest.
  • anitab-org.zulipchat.com, stream: design
[2020-05-24 16:14:53,232] - Summary of results

	   Total items: 	1632
	Items produced: 	1632
	 Items skipped: 	0

	Last item UUID: 	c28b3fbdb777be873cc1934e73bbb603d0edaa58
	Last item date: 	2020-05-24 10:17:43+00:00

	Min. item date: 	2019-12-06 10:13:27+00:00
	Max. item date: 	2020-05-24 10:17:43+00:00

	Min. offset: 	-	Max. offset: 	-	Last offset: 	-


[2020-05-24 16:14:53,234] - Sir Perceval completed his quest.

There were no errors while fetching the chats.

I have started working on the tests and the other things needed to complete this PR.
Please let me know if you have any more suggestions.

Once it is done, I will squash the commits and fix the signoff. 😬

Thanks.

@valeriocos
Member

Thank you for the update @vchrombie . Please ping me when the PR is ready for review.

@vchrombie
Member Author

Hi @valeriocos
I have worked on the tests, but they are not complete yet.

There are still some things pending, and I have run into some errors. I will look into them by tomorrow.
Please leave any suggestions you have when you are free.

@vchrombie vchrombie force-pushed the zulip branch 2 times, most recently from 88b5402 to f7d3c83 Compare May 25, 2020 05:19
Member

@valeriocos valeriocos left a comment

Hi @vchrombie, I had a look at the PR and it is in good shape even though it is not finished. I left some comments; please remember to complete the docstrings and tests.

If this can be of any help, you can run the coverage locally to make sure all code is covered:
[screenshot: captura_439]

Thanks

@vchrombie
Member Author

Thanks for the review @valeriocos.

please remember to complete the docstrings and tests.

I planned to do it once I fix all the tests. I will fix them for sure. 👍

If this can be of any help, you can run the coverage locally to make sure all code is covered

Thanks, I will use it.

This commit adds support to fetch messages
from a Zulip server stream.

Perceval can be used as
$ perceval zulip '<URL>' '<STREAM>' -e '<EMAIL>' -t '<API_KEY>'

The tests have been added accordingly and
the usage docs are also added.

Signed-off-by: Venu Vardhan Reddy Tekula <venu@chaoss.community>
@vchrombie
Member Author

Following the contributing guidelines (#incubating-repositories), this work has been moved to a separate repository [1] and will be maintained there for some time. We can reopen this PR and update it when we want to merge this backend into the main module.

[1] Zulip Backend for Perceval: https://github.com/vchrombie/grimoirelab-perceval-zulip

Best,
Venu

@vchrombie vchrombie closed this Aug 3, 2021
@sduenas
Member

sduenas commented Aug 4, 2021

Thanks!

Development

Successfully merging this pull request may close these issues.

[Feature Request] Add support for Zulip
4 participants