This repository has been archived by the owner on Apr 5, 2024. It is now read-only.

[Publish] processing speed #71

Closed
3 tasks
anuveyatsu opened this issue Feb 1, 2018 · 6 comments

anuveyatsu (Member) commented Feb 1, 2018

After pushing a dataset, I have to wait for some time before my data gets processed on DataHub, e.g.:

100 KB:

  • needed 36 sec after the link was displayed
  • needed 35 sec after the link was displayed (person 2)

1 MB:

  • needed 45 sec after the link was displayed
  • needed 42 sec after the link was displayed (person 2)

150 MB:

  • needed 40 hours after the link was displayed

How to reproduce

Expected behavior

  • processing small dataset (100 KB) | processing time < 10 sec
  • processing medium dataset (1 MB) | processing time < 30 sec
  • processing big dataset (1 GB) | processing time < 10 min
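
The targets above can be written down as a small check, which makes it easy to compare a measurement against them. This is only a sketch: the names `TARGETS` and `meets_target` are hypothetical, decimal units (1 MB = 1000 KB) are assumed, and treating each listed size as the upper bound of a tier is an assumption that the issue does not state.

```python
# Targets from "Expected behavior": (label, max size in bytes, max seconds).
# Assumption: each listed size is the upper bound of its tier.
TARGETS = [
    ("small", 100 * 1000, 10),
    ("medium", 1000 * 1000, 30),
    ("big", 1000 ** 3, 600),
]


def meets_target(size_bytes, elapsed_s):
    """Return True if elapsed_s is under the limit of the first tier covering size_bytes."""
    for _label, max_size, limit_s in TARGETS:
        if size_bytes <= max_size:
            return elapsed_s < limit_s
    raise ValueError("no target is defined for datasets larger than 1 GB")


# Measurements reported in this issue.
print(meets_target(100 * 1000, 36))               # 100 KB in 36 sec
print(meets_target(1000 * 1000, 45))              # 1 MB in 45 sec
print(meets_target(150 * 1000 * 1000, 40 * 3600))  # 150 MB in 40 hours
```

With the numbers reported above, all three checks come out as not meeting the target.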

AcckiyGerman changed the title from "[Push] push speed" to "[Push] processing speed" on Feb 7, 2018

AcckiyGerman (Contributor) commented

Renamed the title from "push speed" to "processing speed": the push itself is fast enough (about 2.5 Mb/sec on my PC, which is my internet provider's limit).

AcckiyGerman changed the title from "[Push] processing speed" to "[Push][Accembler] processing speed" on Feb 7, 2018
zelima changed the title from "[Push][Accembler] processing speed" to "[Publish] processing speed" on Feb 8, 2018
zelima self-assigned this on Feb 9, 2018

zelima (Collaborator) commented Feb 26, 2018

I think < 10 seconds is a really unrealistic goal here. We need to rethink this and define realistic goals.
Results after the latest changes:

  • 100 KB - needs 40 seconds for the full process
    • 5 seconds for push
    • 35 seconds from link displayed in CLI until page reload
  • 1 MB - needs 45 seconds in total
    • 8 seconds for push
    • 37 seconds from link displayed in CLI until page reload

AcckiyGerman (Contributor) commented Feb 27, 2018

100 KB and 1 MB differ in size by a factor of 10, but the processing time differs by only 2 sec (35 vs 37).
I assume there is some time-consuming operation that is not related to the data size.
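
To put a rough number on that, here is a minimal sketch that fits a straight line, time = fixed + per_kb * size, through the two measurements. The linear model, the function name `fit_fixed_and_variable`, and the use of decimal units (1 MB = 1000 KB) are assumptions for illustration; two data points are not enough to confirm the model.

```python
def fit_fixed_and_variable(size_a_kb, time_a_s, size_b_kb, time_b_s):
    """Fit time = fixed + per_kb * size through two (size, time) measurements."""
    per_kb = (time_b_s - time_a_s) / (size_b_kb - size_a_kb)
    fixed = time_a_s - per_kb * size_a_kb
    return fixed, per_kb


# Processing times from this thread (link displayed in CLI until page reload).
fixed_s, per_kb_s = fit_fixed_and_variable(100, 35, 1000, 37)
print(f"fixed overhead: about {fixed_s:.1f} s")
print(f"size-dependent cost: about {per_kb_s * 1000:.2f} s per MB")
```

Under these assumptions almost all of the 35 to 37 seconds is fixed overhead, so reducing the size-independent steps would matter more for small datasets than making the per-byte processing faster.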

AcckiyGerman (Contributor) commented

@zelima consider using PyPy instead of the standard Python interpreter on the server; it could be twice as fast!
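
One way to check this before switching is to run the same script under both interpreters and compare the timings. The sketch below is not the real pipeline: `parse_rows` is a hypothetical stand-in for CPU-bound per-row work, and any speedup it shows may not carry over if most of the time goes to I/O or to fixed, size-independent steps.

```python
import platform
import timeit


def parse_rows(n_rows):
    """Hypothetical stand-in for per-row work: build, split and sum integer fields."""
    total = 0
    for i in range(n_rows):
        fields = f"{i},{i * 2},{i * 3}".split(",")
        total += sum(int(field) for field in fields)
    return total


def best_time(n_rows=100_000, repeats=3):
    """Return the best wall-clock time in seconds over several runs."""
    return min(timeit.repeat(lambda: parse_rows(n_rows), number=1, repeat=repeats))


if __name__ == "__main__":
    # Run this file once with python3 and once with pypy3, then compare the output.
    print(platform.python_implementation(), f"{best_time():.3f} s")
```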

AcckiyGerman (Contributor) commented

@akariv WDYT about the PyPy interpreter? ^^^

zelima modified the milestones: Sprint - 23 Apr 2018, Backlog on May 3, 2018

anuveyatsu (Member, Author) commented

Closing as FIXED since we have improved our processing speed and nobody has reported any speed-related issues.
