Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Switch to streaming #29

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open

WIP: Switch to streaming #29

wants to merge 3 commits into from

Conversation

herrfugbaum
Copy link
Owner

@herrfugbaum herrfugbaum commented Dec 11, 2018

Trying to get streaming working.
Work in Progress, most tests are failing but that's only natural in this case.

  • Create a ReadStream for the input file
  • Parse CSV as a stream
  • Execute SQL on the parsed stream
  • Stream result to stdout (Works in isolation but not in this case. We need to get step 3 working.)
  • Rewrite test for streaming version

For step 3 we have two options:

  1. execute sql line by line
  2. push for example 100 lines into an array. execute sql on that array. resume with next 100 lines.

Any help is appreciated 🤗

For easier debugging I've added a debug.js file and updated the .vscode config

@herrfugbaum herrfugbaum added bug Something isn't working help wanted Extra attention is needed enhancement New feature or request and removed bug Something isn't working labels Dec 11, 2018
@herrfugbaum
Copy link
Owner Author

Got a basic version of streaming working.

The code quality is pretty bad and it introduces unwanted behavior.
Also the 100 row limit is currently hardcoded.

  1. If a file has less then 100 rows it doesn't get parsed
  2. If a file has for example 350 rows the last 50 rows wouldn't get parsed

If these issues are resolved, we can start rewriting tests for streaming.

@herrfugbaum
Copy link
Owner Author

New commit fixes the issues from the last comment (at least technically).

  1. Files that are smaller than the row limit will work now.
  2. Files that are not divisible by the row limit will not get cut off but the rest will render as a new table (cosmetic issue)

New ToDo:

  • Test if all SQL operations are working correctly
  • Rewrite tests for streaming version
  • Refactor parseFile.js
  • Check if the variable and file names are still relevant
  • Make it work with the repl (currently works only with debug.js)

@herrfugbaum
Copy link
Owner Author

A quick check shows that all SQL operations seem to work but the LIMIT operator.

@herrfugbaum
Copy link
Owner Author

Ok this introduces way more bugs than anticipated 😅

WHERE doesn't work in all cases (no idea yet why)

ORDER BY doesn't work properly because it only orders 100 rows, then the next 100 and so on.
For example a file containing the numbers 0-199 (newline separated) should result in something like

Table 1:
number
199
198
...
100
Table 2:
number
99
98
...
0

instead of

Table 1:
number
99
98
...
0
Table 2:
number
199
198
...
100

for the statement SELECT * FROM table ORDER BY number DESC

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request help wanted Extra attention is needed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant