docs: "What's next", for discussion #2559

Closed
wants to merge 1 commit into from

Conversation

max-sixty
Member

@snth asked where to make comments on this — the doc we used for discussing "What's next" on the call at the weekend.

So adding here — I'm not planning to merge it in its current state, but we can use this PR for any discussions. (And very open to merging something like this, probably more developed).

Comment on lines +89 to +96
- Fork Rill

- Rill seems to be an especially good match for PRQL — enabling quick
exploration and immediate results.
- However, I didn't receive a response from the team in my latest email
(though I can follow up again). We had a productive Zoom meeting previously.
- We could consider forking the project if necessary, and then using any
success there to attempt to merge it to mainline.
Member

As I noted, forking Rill would be a temporary boost to the project, but would not showcase the strengths of PRQL, as it would be hard to integrate good highlighting and LSP support.

@snth
Member

snth commented May 11, 2023

Thanks for this. I've had some thoughts written down on HackMD for a few days but have not found the time to transfer them (since I want to edit and expand). I will do my best to get these across as soon as I can.

Comment on lines +73 to +75
- The R extension looks great. I don't have much experience with R and haven't
explored it extensively. Do others have opinions? Is the R audience enough
to make a project-wide bet on R?
Member

@eitsupi eitsupi May 13, 2023

In my personal opinion, the most valuable aspect of prqlr is the ability to easily embed PRQL code blocks in Quarto documents.
This also can be done with IPython magic with pyprql, but it is interpreted as a Python code block and does not look very good. (quarto-dev/quarto-cli#4227)

Embedding PRQL code blocks into Quarto documents works very well, as in https://github.com/eitsupi/querying-with-prql, but the problem is that there is no way to run PRQL code interactively.
SQL code blocks can be executed interactively in the RStudio IDE, but if you want PRQL code blocks to work the same way, we will need to add functionality to the RStudio IDE. (Related to rstudio/rstudio#12767)
Also, not even SQL code blocks are currently executable in VSCode. (quarto-dev/quarto#26)
I am thinking it would be more realistic to wait for Quarto to be able to shape IPython Magic's embedded code blocks into pretty code blocks rather than making PRQL code blocks interactively executable in RStudio or VSCode.

@snth
Member

snth commented May 15, 2023

Hi folks,

sorry about the long absence. I will start with a general post with some comments and thoughts and then maybe will open a review with comments on specific points in the text (if I feel that's still going to add value).

@max-sixty's discussion text frames the problem of adoption in terms of Users, Tools and Developers. My sense is that we should aim for users, probably through availability of PRQL in tools. Developers would make that possible but are probably harder to attract/recruit. I like how Josh Wills described DuckDB in the Data Stack show podcast (starting at 46:20) as an "Experiential Good", i.e. something where the experience is so good that anyone who tries it starts becoming an advocate for it. I feel similarly about PRQL and feel that if we could just get people to try PRQL then adoption would follow on its own over time. In that episode they also talk about how Josh was singing the praises of DuckDB long before it was as cool as it is now and I think the experiential good point was about how, even though the number of users was small at the start, the experience had such a profound effect on people that the growth came eventually, it had to come. I still hold out hope in a similar vein for PRQL.

Overall, I think we should continue to work on making the language as good and useful as it can be while trying to enable integration into more tools. I got a few people to open experimental branches of PRQL into their tools (BuenaVista, Evidence.dev) but failed to convert those into full integrations because of lack of time on my part. That is my bad and a lesson I'm taking away from that is that I will try to create more visibility around this and open Issues on GitHub for these so others can assist or take over should I drop away.

I feel that building our own tool would be a distraction. It's hard enough for tools like dbt or Rill to get traction and becoming another competitor in that space would just lead to more fragmentation and I also doubt that we have the resources / manpower to be very competitive. I also think that there are still things that can be improved in the language to make the user experience better first so I would start (continue) there.

DuckDB is amazing and the pace at which they continue to innovate, both on the db tech as well as their SQL dialect, is staggering. I therefore totally see the attraction of focusing our efforts on just targeting them as a backend (as I did in my Pi Day blog post or in prql-query). However for the PRQL language I believe that would be a mistake, at least at this point. We've made quite a song and dance about PRQL's database agnosticism as being one of its core features/benefits. To reverse course on that point would not set a good precedent for building trust amongst users that we will stick to our vision and roadmap.

One idea I had to get around the burden of supporting multiple SQL dialects is to utilise the sqlglot library. It's a great pity that this is written in Python instead of Rust. I thought that here perhaps a bit of funding could be well spent. The idea is to have sqlglot translated to Rust. Perhaps with GPT4 assistance this might be a quicker job than it would be otherwise (I have no personal experience of this but that's the impression I'm getting from LLM chatter). It's quite a well-contained project and should be relatively easily testable so it might make a good candidate for a GSoC project? Otherwise maybe with a bit of funding someone could be paid to do this as a freelance job? Toby Mao, the sqlglot creator, might also be interested in this since I think sqlglot forms quite an integral part of his sqlmesh project so that would also benefit from the speed and reliability improvements of having sqlglot in Rust.

I think dbt continues to be a worthwhile integration to pursue. sqlmesh is the new entrant in that space which is also picking up quite a bit of interest. I haven't had time to investigate it properly yet myself but from what I've seen so far it looks good. A sqlmesh integration could also be great. Especially if we could cooperate on sqlglot.

@snth
Member

snth commented May 16, 2023

Update on my previous post: based on this tweet Toby Mao might not be that enthusiastic to rewrite sqlglot into Rust:

https://twitter.com/Captaintobs/status/1658351633099812865

@aljazerzen
Member

We've made quite a song and dance about PRQL's database agnosticism as being one of its core features/benefits. To reverse course on that point would not set a good precedent...

I wouldn't say we should start dropping support, just not support new features for dialects that require additional work.

The main point here is that while PRQL right now may be interesting to "language people" and people who are really frustrated with SQL, it does not provide a good workflow to any user group. Our recommended way of using PRQL with real data right now would probably be Jupyter magic, which does not have syntax highlighting or error detection. There are different problems with the VS Code extension and pq, but my point is this:

we should focus on one use-case, and perfect that

This does not mean that we drop support for other use-cases; on the contrary, many of the fixes and features we will implement for the focused use-case will be applicable to other use-cases too. My thinking here is that we cannot develop the tooling for every use-case where PRQL would excel: there are just too many of them. We need an MVP that people can use, right now, without copy-pasting from the Playground.

@tobymao

tobymao commented May 16, 2023

Update on my previous post: based on this tweet Toby Mao might not be that enthusiastic to rewrite sqlglot into Rust:

https://twitter.com/Captaintobs/status/1658351633099812865

Although there's little chance I'm going to rewrite sqlglot in Rust, I'm happy to collaborate. For example, you could leverage sqlglot's multi-dialect transpilation, optimizations (predicate/projection pushdowns, boolean algebra), etc. Let me know if you're interested in chatting :)
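
A minimal sketch of what leaning on sqlglot's multi-dialect transpilation could look like, assuming PRQL is first compiled to a single SQL dialect and sqlglot then fans that SQL out to other targets. `sqlglot.transpile` is sqlglot's own entry point; the `fan_out` helper and its injectable `transpile` parameter are hypothetical names introduced here for illustration, not part of PRQL or sqlglot.

```python
from typing import Callable, Dict, Iterable, List, Optional

# Shape of a transpiler callable: (sql, read=..., write=...) -> list of SQL strings.
Transpiler = Callable[..., List[str]]


def fan_out(
    sql: str,
    read: str,
    targets: Iterable[str],
    transpile: Optional[Transpiler] = None,
) -> Dict[str, str]:
    """Translate one SQL string (e.g. compiler output) into several dialects."""
    if transpile is None:
        # Imported lazily so the helper can be exercised without sqlglot installed.
        import sqlglot

        transpile = sqlglot.transpile

    result: Dict[str, str] = {}
    for dialect in targets:
        # transpile returns one string per statement in the input.
        statements = transpile(sql, read=read, write=dialect)
        result[dialect] = ";\n".join(statements)
    return result
```

With sqlglot installed, a call such as `fan_out("SELECT name FROM employees LIMIT 5", "duckdb", ["postgres", "tsql"])` would return one SQL string per target dialect. Passing a stub as `transpile` keeps the helper testable without the dependency.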

@max-sixty
Member Author

The main point here is that while PRQL right now may be interesting to "language people" and people who are really frustrated with SQL, it does not provide a good workflow to any user group. Our recommended way of using PRQL with real data right now would probably be Jupyter magic, which does not have syntax highlighting or error detection. There are different problems with the VS Code extension and pq, but my point is this:

we should focus on one use-case, and perfect that

I strongly agree with this (where "one use-case" might be a couple, but it's driving towards a use-case...)

@max-sixty
Member Author

Something @snth brought up on the recent dev call — a TUI for querying data, similar to the playground:

  • Could work with local data, potentially data from an HTTP source
  • Would rely on DuckDB for all computation (so less flexible than the prql-query vision)
  • Similar to the playground, it could show results on every keystroke. So unlike a prqlc compile | duckdb approach, it could allow for easy iteration; maybe keeping some history of queries
  • Visidata is a comparison, implemented in Python

This has the same drawbacks as other proposals to build our own tools — users have to change both the language and the tool they're using. But the TUI might attract technical users, which are likely a good set of early users. And it might uniquely appeal to PRQL's advantages for exploratory work. Having it in Rust would give access to a bunch of TUI crates.
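
The difference between a one-shot `prqlc compile | duckdb` pipeline and the re-run-on-every-keystroke idea can be sketched roughly as follows. This is an illustration only: it assumes `prqlc` and `duckdb` binaries are on PATH and that `prqlc compile` reads PRQL from stdin. `Session`, `on_edit`, `compile_prql` and `run_sql` are hypothetical names, and Python is used purely for brevity (the proposal above suggests Rust for a real tool).

```python
import subprocess
from typing import Callable, List, Tuple


def compile_prql(prql: str) -> str:
    """PRQL -> SQL via the prqlc CLI (assumed to be on PATH, reading stdin)."""
    done = subprocess.run(
        ["prqlc", "compile"], input=prql, capture_output=True, text=True, check=True
    )
    return done.stdout


def run_sql(sql: str) -> str:
    """Run SQL in an in-memory DuckDB via the duckdb CLI (assumed to be on PATH)."""
    done = subprocess.run(
        ["duckdb"], input=sql, capture_output=True, text=True, check=True
    )
    return done.stdout


class Session:
    """Re-evaluates the query after each edit and keeps a history of queries that ran."""

    def __init__(
        self,
        compile_fn: Callable[[str], str] = compile_prql,
        execute_fn: Callable[[str], str] = run_sql,
    ) -> None:
        self.compile_fn = compile_fn
        self.execute_fn = execute_fn
        self.history: List[str] = []
        self.last_output = ""

    def on_edit(self, prql: str) -> Tuple[bool, str]:
        """Return (ok, text to display) for the current editor contents."""
        try:
            output = self.execute_fn(self.compile_fn(prql))
        except Exception:
            # Half-typed queries fail constantly; keep the previous result on screen.
            return False, self.last_output
        self.last_output = output
        # Record the query unless it repeats the most recent history entry.
        if not self.history or self.history[-1] != prql:
            self.history.append(prql)
        return True, output
```

A real tool would likely debounce keystrokes and be more selective about what it records in the history. The sketch only shows that the compile and execute steps are small enough to sit behind a per-edit callback.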

@aljazerzen
Member

Moved to a HackMD document.

@aljazerzen aljazerzen closed this May 30, 2023
5 participants