Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: add chatbot tutorial #3382

Merged
merged 17 commits into from
Sep 14, 2021
Merged

docs: add chatbot tutorial #3382

merged 17 commits into from
Sep 14, 2021

Conversation

alaeddine-13
Copy link
Contributor

@alaeddine-13 alaeddine-13 requested a review from a team as a code owner September 10, 2021 09:29
@github-actions github-actions bot added size/L area/docs This issue/PR affects the docs area/housekeeping This issue/PR is housekeeping labels Sep 10, 2021
@codecov
Copy link

codecov bot commented Sep 10, 2021

Codecov Report

Merging #3382 (aa726e4) into master (d37e6ab) will increase coverage by 0.53%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3382      +/-   ##
==========================================
+ Coverage   89.71%   90.24%   +0.53%     
==========================================
  Files         152      152              
  Lines       10840    10840              
==========================================
+ Hits         9725     9783      +58     
+ Misses       1115     1057      -58     
Flag Coverage Δ
daemon 45.55% <ø> (-0.02%) ⬇️
jina 90.23% <ø> (+0.54%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
jina/peapods/runtimes/jinad/__init__.py 91.42% <0.00%> (-1.91%) ⬇️
jina/types/document/__init__.py 96.07% <0.00%> (+0.78%) ⬆️
jina/helper.py 83.33% <0.00%> (+1.81%) ⬆️
jina/peapods/runtimes/gateway/http/app.py 92.40% <0.00%> (+2.53%) ⬆️
jina/clients/request/__init__.py 100.00% <0.00%> (+5.00%) ⬆️
jina/clients/request/helper.py 83.92% <0.00%> (+5.35%) ⬆️
jina/clients/request/asyncio.py 88.88% <0.00%> (+5.55%) ⬆️
jina/clients/base/__init__.py 92.42% <0.00%> (+18.18%) ⬆️
jina/types/document/generators.py 93.33% <0.00%> (+36.00%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e7d47c0...aa726e4. Read the comment docs.

@github-actions
Copy link

github-actions bot commented Sep 10, 2021

Latency summary

Current PR yields:

  • 😶 index QPS at 1118, delta to last 2 avg.: +3%
  • 😶 query QPS at 45, delta to last 2 avg.: +1%
  • 😶 dam extend QPS at 39289, delta to last 2 avg.: +0%
  • 😶 avg flow time within 1.7896 seconds, delta to last 2 avg.: -1%
  • 😶 import jina within 0.5008 seconds, delta to last 2 avg.: +3%

Breakdown

Version Index QPS Query QPS DAM Extend QPS Avg Flow Time (s) Import Time (s)
current 1118 45 39289 1.7896 0.5008
2.1.0 1154 46 41299 1.7953 0.4554
2.0.24 1006 41 37349 1.8355 0.5155

Backed by latency-tracking. Further commits will update this comment.

Copy link
Contributor

@tadejsv tadejsv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code in this tutorial is currently too obtuse, most of the lines are just about downloading the dataset and not actual flow/executor things.

Can you refactor it using huggingface datasets dataset (I think it should be possible), and add some line breaks for readability in the code snippets.

Also, assume that the users are very lazy, when possible try to show what the output of the code snippets would be (huggingface course does this well, see for example https://huggingface.co/course/chapter3/2?fw=pt)

docs/tutorials/chatbot.md Outdated Show resolved Hide resolved
docs/tutorials/chatbot.md Show resolved Hide resolved
docs/tutorials/chatbot.md Outdated Show resolved Hide resolved
docs/tutorials/chatbot.md Outdated Show resolved Hide resolved
docs/tutorials/chatbot.md Outdated Show resolved Hide resolved
@tadejsv
Copy link
Contributor

tadejsv commented Sep 10, 2021

My bad, the HF dataset is not the same one as the one used by kaggle. Still, check if the HF one can be a good substitute for Kaggle one

docs/tutorials/chatbot.md Outdated Show resolved Hide resolved
Copy link
Contributor

@tadejsv tadejsv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left some comments that will improve readability, but we really need to get rid of all the downloading code - it's complexity is just dominating over this whole tutorial.

Either we figure out how to use HF datasets (I would create one myself, but am not sure about licensing), or we add instructions in pre-requisites for how to manually download the file from kaggle - that would be much nicer.

docs/tutorials/chatbot.md Show resolved Hide resolved
docs/tutorials/chatbot.md Outdated Show resolved Hide resolved
docs/tutorials/chatbot.md Outdated Show resolved Hide resolved
docs/tutorials/chatbot.md Outdated Show resolved Hide resolved
docs/tutorials/chatbot.md Outdated Show resolved Hide resolved
docs/tutorials/chatbot.md Outdated Show resolved Hide resolved
docs/tutorials/chatbot.md Outdated Show resolved Hide resolved
@github-actions github-actions bot added size/L and removed size/XL labels Sep 10, 2021
@alaeddine-13
Copy link
Contributor Author

I've left some comments that will improve readability, but we really need to get rid of all the downloading code - it's complexity is just dominating over this whole tutorial.

@tadejsv I think authors of the tutorial opted for downloading each time to make usage faster and easy. Also keep in mind that the downloaded dataset is hosted by jina not kaggle: https://static.jina.ai/chatbot/dataset.csv
I have reduced the size of download_data so that's no more dominating, can you re-check ?

@github-actions github-actions bot added size/L and removed size/XL component/client area/testing This issue/PR affects testing executor/meta area/cicd This issue/PR affects the cicd pipeline area/housekeeping This issue/PR is housekeeping area/core This issue/PR affects the core codebase area/helper This issue/PR affects the helper functionality area/entrypoint This issue/PR affects the entrypoint codebase component/type component/executor component/jaml component/flow labels Sep 14, 2021
Copy link
Contributor

@tadejsv tadejsv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job, looks much better now. Just some final minor changes

docs/tutorials/chatbot.md Outdated Show resolved Hide resolved
docs/tutorials/chatbot.md Outdated Show resolved Hide resolved
docs/tutorials/chatbot.md Outdated Show resolved Hide resolved
Copy link
Contributor

@tadejsv tadejsv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good 👍

tadejsv
tadejsv previously approved these changes Sep 14, 2021
Copy link
Member

@alexcg1 alexcg1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm...I'm unable to do line comments in the file. Here's my feedback:

We will build a fuzzy search demo on the source code
Change to: We will build a fuzzy search demo for source code

Also re-order tutorials in docs/index.md:

  1. Chatbot
  2. Executor
  3. Executor on GPU
  4. Practice learning

docs/tutorials/chatbot.md Outdated Show resolved Hide resolved
docs/tutorials/chatbot.md Outdated Show resolved Hide resolved
docs/tutorials/chatbot.md Outdated Show resolved Hide resolved
@alaeddine-13
Copy link
Contributor Author

Hmmm...I'm unable to do line comments in the file. Here's my feedback:

We will build a fuzzy search demo on the source code
Change to: We will build a fuzzy search demo for source code

Also re-order tutorials in docs/index.md:

  1. Chatbot
  2. Executor
  3. Executor on GPU
  4. Practice learning

done

Copy link
Member

@alexcg1 alexcg1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@alaeddine-13 alaeddine-13 merged commit 660cb3e into master Sep 14, 2021
@alaeddine-13 alaeddine-13 deleted the docs-chatbot-tutorial branch September 14, 2021 08:58
@ghost
Copy link

ghost commented Sep 14, 2021

Hmmm...I'm unable to do line comments in the file. Here's my feedback:

We will build a fuzzy search demo on the source code

Change to: We will build a fuzzy search demo for source code

Also re-order tutorials in docs/index.md:

  1. Chatbot

  2. Executor

  3. Executor on GPU

  4. Practice learning

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/docs This issue/PR affects the docs size/L
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants