Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to use SQL function in SearchLight? #18

Open
huaxk opened this issue Jul 13, 2019 · 20 comments
Open

How to use SQL function in SearchLight? #18

huaxk opened this issue Jul 13, 2019 · 20 comments

Comments

@huaxk
Copy link

huaxk commented Jul 13, 2019

How to use SQL function in SearchLight? Can give some examples.

@essenciary
Copy link
Member

@huaxk Not sure what you mean - can you be more specific?

@huaxk
Copy link
Author

huaxk commented Jul 16, 2019

I know:
SearchLight.count(Article) generate SQL QUERY: SELECT COUNT(*) AS __cid FROM "articles"

If i want to use SQL function(SUM or other function) to generate SQL like: SELECT SUM(id) AS __sid FROM "articles",it seems there is no funciton: SearchLight.sum().

I try: SearchLight.find_df(Article, Q(columns=[C("sum(id)", raw=true)])), it works!
But not direct support SQL function, nor elegant! Is there a better way to use SQL function?

@essenciary
Copy link
Member

@huaxk I understand now, thanks for the details. I'll take a look and see what would be the best way to support more aggregated functions.

@huaxk
Copy link
Author

huaxk commented Jul 22, 2019

Can I try to implement this feature? My project needs to use this feature.

@essenciary
Copy link
Member

@huaxk Of course, please do! Should be very similar to the current implementation for COUNT.

@huaxk
Copy link
Author

huaxk commented Jul 23, 2019

Is there unit test code in SearchLight source code? If there is a test that will help to write new code.

@essenciary
Copy link
Member

@huaxk Nothing usable at the moment. You can add them in the test/ folder. I'm working on setting up some form of CI as they'd have to be run against multiple DB backends.

@maherkhalil07
Copy link

why there is no documentation to use SearchLight?

@essenciary
Copy link
Member

Because I didn't have time

@maherkhalil07
Copy link

i can help you if you guide me

@essenciary
Copy link
Member

@maherkhalil07 Thank you - I'm now working on a major internal refactoring and API cleanup. It should be the last major rewrite before v1 and having a stable API so it will be a great time to set up the docs. I'm also adding a lot of tests and in-code documentation. Once that will be pushed, the docstrings can be used to write guides and set up proper docs pages.

@maherkhalil07
Copy link

when shall that be expected?
please note, without database and ORM clear documentation, Genie can not be used and also easy way to compile Genie

@essenciary
Copy link
Member

essenciary commented Nov 9, 2019

It entirely depends on how much time I can allocate in-between my paid work. "A few months" is a reasonable estimate.

However, do keep in mind that the code is documented thoroughly so you can check the documentation for the methods - as well as use Julia's help system.

Also, Genie can be used without SearchLight. SearchLight is not a requirement for Genie. You can roll-out your own model implementation using existing Julia database libraries (MySQL.jl, SQLite.jl, etc) if you don't want to use SearchLight.

Regarding compilation - I'm not sure why you ask for this. Genie runs on top of Julia which is JIT-compiled. AOT compilation for Julia is experimental and not something done routinely. However, this is beyond the scope and focus of Genie, but you can research it and attempt it, there are some articles on the net.

@maherkhalil07
Copy link

can you include Genie with this benchmark web frameworks evaluations
https://www.techempower.com/benchmarks/#section=data-r18
it will be highly appre3cited for speed evaluation

@iskyd
Copy link

iskyd commented Jan 30, 2022

Any updates on this issue? @essenciary @huaxk

@FrankUrbach
Copy link
Contributor

If we come back to the question from the beginning of this issue, I don't think we should implement sql functions in SearchLight because this is very easy to implement for somebody who needs this. For example a count function would be find(Author) where a dataframe will come back and the full function would be nrow(find(Author)). Therefore it isn't worth to think about such implementations. This would be more a convenient thing than really needed.
In my opinion the ORM is more meant to be a convenient way to store struct data into a database without to hassle with error prone sql statements. And in my eyes this would be against the philosophy of Julia where specialized functions will orchestrated to a swinging thing 😄.

@iskyd
Copy link

iskyd commented Feb 1, 2022

This is very inefficient and non suitable for some use-cases. Why would you load the whole data in a dataframe and then count the elements while you can use an optimized SQL function?

@FrankUrbach
Copy link
Contributor

Yes, it is. But have you so often the need for such a function? As I worked with much data I didn't had needed such things that much. I argument from a point of view as the maintainer of one adapter for Searchlight. Such function has to be implemented in each adapter to fulfill your wishes. Is it worth to do so or wouldn't it be better to implement this things in a helper modul by yourself. You can pass each sql function you want to Searchlight. This isn't a heavy exercise and you will get your result immediately. If Adrian as the maintainer of SearchLight.jl thinks it is worth to do so I can assist to implement this. But from my standpoint I wouldn't do that until a mass of people request such things.

@essenciary
Copy link
Member

I agree with @iskyd that these must definitely be implemented and executed at db level - for any statistical values (min, max, etc) the DB is highly optimized (assuming indexes) and we avoid querying and transferring potentially a huge result to process on the Julia side.

Then as @FrankUrbach says, because these are implementation specific, they need to be added into each database backend. And of course, it's a good question what should we add out of the multitude of options. I guess statistical functions would be a good start, there are lots of use cases for these in regular data apps.

@FrankUrbach
Copy link
Contributor

Ok. I recognize that you want implement such functions in SearchLight. I think the easiest way to do it would be to create a branch in the main repository and define there what functions you think about it and what mathematical background should be used. The interesting task will be to find out how each database implements the function. A user of SearchLight will expect that each db backend delivers the same answer. It will be an interesting journey to find out more about it. But the first thing is to define the requested functions with its names in SearchLight.jl. Then each adapter can be developed accordingly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants