Q: ML.NET within SQLCLR #2571

Closed
grahamehorner opened this issue Feb 15, 2019 · 5 comments
Labels
question Further information is requested

Comments

@grahamehorner

I would like to run/train an ML.NET model from within SQL Server as a SQLCLR assembly. At present ML.NET is failing with an obscure error that looks to be related to security. Is it, or will it be, possible to run/train ML.NET models from inside SQL stored procedures, close to the data source?

@endintiers
Contributor

@grahamehorner From an architecture perspective it doesn't sound like a good idea. You are talking about using expensive SQL Server CPU resources to do training, which often uses extreme amounts of CPU. Some ML algorithms use GPUs instead of CPUs if they are available, and in the future we will likely have algorithms that depend on quantum co-processors (at least during training). You may also want to build pipelines that do things that are not allowed in the SQLCLR sandbox (like file I/O?).
If you can build a training pipeline in .NET Standard (1.6 is supported in SQLCLR?) then it could work, but why not just do the training outside? It will be much easier. You could do the predictions in SQLCLR in some cases (depending on the pipeline code being SQLCLR compatible).
SQLCLR is a sandbox environment with limitations on how you can interact with resources outside it. Training could work but would require the pipeline(s) and algorithm(s) to only do things allowed by the sandbox.

@grahamehorner
Author

grahamehorner commented Feb 18, 2019 via email

@endintiers
Contributor

endintiers commented Feb 18, 2019

@grahamehorner Yeah, that makes sense. I have had similar requirements myself (pre ML.NET).

Have you tried to use ML.NET in SQLCLR yet? If so what problems did you encounter? (save me some pain trying it).

Have you tried using https://docs.microsoft.com/en-us/sql/advanced-analytics/what-is-sql-server-machine-learning?view=sql-server-2017, or the 2019 version of the same? That's only R or Python so far, though, and in any case, because it's an external sandbox, the data will be unencrypted when passed, although it stays on the same machine/VM.

Some of the diagrams for that tout 'keep your data encrypted' but I don't think that's actually how it works. The subtleties don't always make it to the marketing department of the VLCC :-). It is pretty safe though, I'm not sure how you could hack that without taking the server's O/S first....

The junior equivalent would be to just run your training on the SQL Server machine in a separate process. No extra tech is needed and there is no network traffic (using shared memory), so it is as safe as your database server, and arguably as safe as something running in the SQL Server extensibility framework.

@glebuk
Contributor

glebuk commented Feb 21, 2019

@grahamehorner,
This is tricky but possible if and only if you can get .NET Standard 2.0 (which ML.NET is written against) to run under the custom version of the CLR that SQL Server uses.
Then you'll have to jump through at least several hoops:

  1. Merge your custom DLL with all its dependencies (such as ML.NET) into a single DLL using ILMerge, so that you can add it as a SQL function.
  2. Ensure that unmanaged dependencies, if any, are found by the SQL Server process.
  3. Create a custom unsafe SQL CLR function.

Please let us know if you get it to work!
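
For reference, step 3 might look roughly like the T-SQL below. This is only a sketch under stated assumptions, not a tested recipe: the database name `MlDemo`, the assembly name `MlNetMerged`, the DLL path, and the `MlFunctions.Predict` class and method are all hypothetical placeholders for whatever the merged DLL from step 1 actually contains. It also assumes the simplest (and least secure) way of trusting an unsafe assembly, marking the database `TRUSTWORTHY`; signing the assembly is the usual alternative.

```sql
-- Enable CLR integration on the instance (it is off by default).
EXEC sp_configure 'clr enabled', 1;
RECONFIGURE;
GO

-- ASSUMPTION: hypothetical database name. Marking the database TRUSTWORTHY
-- is the quick route to loading an UNSAFE assembly; signing the assembly
-- and granting UNSAFE ASSEMBLY to its login is the safer route.
ALTER DATABASE MlDemo SET TRUSTWORTHY ON;
GO

USE MlDemo;
GO

-- ASSUMPTION: hypothetical assembly name and path to the ILMerge output.
CREATE ASSEMBLY MlNetMerged
FROM 'C:\sqlclr\MlNetMerged.dll'
WITH PERMISSION_SET = UNSAFE;
GO

-- ASSUMPTION: the merged DLL exposes a public static method
-- MlFunctions.Predict(SqlDouble) returning SqlDouble.
CREATE FUNCTION dbo.PredictScore (@feature FLOAT)
RETURNS FLOAT
AS EXTERNAL NAME MlNetMerged.MlFunctions.Predict;
GO
```

Even if the assembly registers, whether ML.NET actually loads depends on the first condition above (.NET Standard 2.0 running under SQL Server's hosted CLR).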

@glebuk glebuk added question Further information is requested answered labels Feb 21, 2019
@artidoro
Contributor

artidoro commented Jul 2, 2019

I am closing this for lack of activity. I am also linking this PR, which introduces initial specs for a SQL data loader: #3857

@artidoro artidoro closed this as completed Jul 2, 2019
@ghost ghost locked as resolved and limited conversation to collaborators Mar 24, 2022