[janhq/jan#1635] feat: decouple nitro engine into a library #367
Conversation
Force-pushed from 6662320 to 01731cf
Test build on different yarn versions (platform: macOS M1).
How to reproduce:
npm i -g corepack
corepack enable --install-directory <some_dir_in_PATH>
corepack install -g yarn@<1,2,3>
make clean all
Force-pushed from 70c20ee to d7344dd
Force-pushed from 13245cf to cba07be
Force-pushed from cb57401 to d85535c
Tests passed on Ubuntu (no CUDA) and Mac-Intel: https://github.com/InNoobWeTrust/nitro/actions/runs/7652932312
Force-pushed from 02d91bd to ddd1406
…vironment" This reverts commit b6727a0. Polyfill alternatives for fetch are not fully supported ReadableStream.
…erence-engine-into-a-library
This exposes the exit code, termination signal, and stdio for the nitro subprocess. This should help with solving #369.
Force-pushed from ff73fbd to 8a309a3
 * Find which executable file to run based on the current platform.
 * @returns The name of the executable file to run.
 */
export const executableNitroFile = (
What I'm seeing here is that this function expects `folderPath` as the parameter, and there is a bunch of available built binaries for Nitro. In my opinion, it's not a good design.
Personally I think there should be a better way:
- `node-nitro` exposes a util function to return the exact binary (e.g. Windows with CUDA, or Mac ARM64 with Metal); see the sketch after this list.
- Then run init; in the init function it downloads the binary if none is there, or the user can prefill it with another util function.
- If it fails, let it fail and show the user why it failed.
- `getNvidiaConfig()` is very specific; we should only use it in utils, as Nitro will support many other builds (AMD with Vulkan, SYCL for Intel, Linux ARM, etc.).
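A minimal sketch of the kind of util function suggested above, assuming hypothetical variant names and a `hasCuda` flag provided by a separate detection step (none of these identifiers are from the actual codebase):

```ts
import * as os from 'os';

type NitroVariant =
  | 'win-cuda'
  | 'win-cpu'
  | 'mac-arm64'
  | 'mac-x64'
  | 'linux-cuda'
  | 'linux-cpu';

// Map the current platform/arch (plus a CUDA hint) to a single binary variant.
export const resolveNitroVariant = (hasCuda: boolean): NitroVariant => {
  const platform = os.platform(); // 'win32' | 'darwin' | 'linux' | ...
  const arch = os.arch();         // 'x64' | 'arm64' | ...

  if (platform === 'darwin') return arch === 'arm64' ? 'mac-arm64' : 'mac-x64';
  if (platform === 'win32') return hasCuda ? 'win-cuda' : 'win-cpu';
  return hasCuda ? 'linux-cuda' : 'linux-cpu';
};
```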
- Can we split this logic into another PR? I don't know yet what the naming agreement for the release artifacts will be (with/without CUDA versions, plus some inconsistencies), so defining a single generic logic for the executable file is somewhat out of my control...😅
- In `nitro.ts`, the `runModel()` function already tries to download the binaries automatically if they are not there yet. If we want to define a specific binary to download, we can run the system analysis on `initialize()`. After that, in `runModel()` we will have all the necessary information to download the correct one.
- I don't quite understand your point about the behavior when failing; can you share a logical flow for failure handling so I can better understand what you mean? 😅
- I'm a little dumb when it comes to AMD and Intel things. For your suggestion, I will move `getNvidiaConfig()` out and only call it in an iterative greedy approach when detecting available hardware (loop over the supported hardware and run the check for each; rough sketch below). Then, after we have the `supported-hardware` array, we will have a decision on what binary to load? 🤔
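A minimal sketch of that iterative detection loop, assuming a hypothetical probe table; `getNvidiaConfig()` is the only check mentioned in this PR and its signature below is a placeholder:

```ts
// Placeholder signature for the existing NVIDIA check (assumed, not the real one).
declare function getNvidiaConfig(): Promise<unknown>;

type HardwareProbe = { name: string; check: () => Promise<boolean> };

// Probes are tried in order; more can be appended later (AMD/Vulkan, Intel/SYCL, Linux ARM, ...).
const probes: HardwareProbe[] = [
  { name: 'nvidia-cuda', check: async () => Boolean(await getNvidiaConfig()) },
];

export const detectSupportedHardware = async (): Promise<string[]> => {
  const supported: string[] = [];
  for (const probe of probes) {
    if (await probe.check().catch(() => false)) supported.push(probe.name);
  }
  return supported;
};
```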
- Why does `node-nitro` need to care about the model download?
- I'm OK with running the analysis first and installing the correct Nitro, not expecting a folder with all Nitro builds waiting to be initialized (with opinionated names like `win-cpu/nitro`, `mac/nitro`). And it should reflect that idea.
- For the logical flow of failure: a compiled C++ program can yield arbitrary errors (core dump, illegal instructions, etc.) that a normal subprocess binding simply cannot understand. Those have to be handled by `node-nitro` if you think of providing surplus value here.
- It's not about being dumb or not, it's technical engineering: we think through the use cases we MIGHT need to solve, not only the ones we CURRENTLY have.

Hope you understand. Thank you.
- I don't quite understand this; nitro-node only helps make the nitro binaries available on the user's machine, there is no downloading of the model here...😅
- OK, will go with this. But the logic will not be flexible in the early versions; we need to discuss the naming scheme of the nitro release archives (can we have only a single generic CUDA build without separating it into different versions?).
- The scope of this PR is to first split the logic out of Jan into a library here, for re-use and easier maintenance, as it will be tightly coupled with nitro development. I'm not yet involved much in the architecture/build decisions of nitro, so for now I will just stick with the subprocess. Once nitro can be used as a system library, we can make a Node binding later (that will be much more work). Let's go with the iterative approach and create another issue for the Node binding for now; I don't want the PR to become too big to review with unplanned things...😅
- I believe we can't handle all the unplanned things at once or we will go nowhere. It might be a use case, but not until it's confirmed to be worth our effort (user survey first, confirmed to be valid before doing it). If things are not planned yet and there is no agreement on the roadmap, then I don't think we should spend much time over-engineering it. Just deliver features one step at a time...

For your suggestions, I'm considering and thinking about the best ways to adapt for now with the minimal effort possible. But for things in the future, please share the roadmap link for the mentioned points and a defined plan so I can understand the steps to deliver it iteratively. I'm not a team member at the moment, so I don't know your plans well; pardon me if I'm not understanding your points correctly...😅
> Why does `node-nitro` need to care about the model download?
> I'm OK with running the analysis first and installing the correct Nitro, not expecting a folder with all Nitro builds waiting to be initialized (with opinionated names like `win-cpu/nitro`, `mac/nitro`). And it should reflect that idea.
> For the logical flow of failure: a compiled C++ program can yield arbitrary errors (core dump, illegal instructions, etc.) that a normal subprocess binding simply cannot understand. Those have to be handled by `node-nitro` if you think of providing surplus value here.
> It's not about being dumb or not, it's technical engineering: we think through the use cases we MIGHT need to solve, not only the ones we CURRENTLY have.
> Hope you understand. Thank you.
Sorry, I misunderstood the third point. I already exposed the error code and exit signal upon exit of the nitro subprocess in the recent commits. But for handling core dumps and crashes, I'm not yet sure how Node handles those cases; let me check and come up with an appropriate solution for the mentioned case. My initial assumption is that if nitro crashes or core dumps, there will be no callback with an error code or signal, but a disconnect callback will be called; we can then detect that and get the dump information from the system...🤔
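For reference, a minimal sketch of how a Node `child_process` wrapper can surface exit codes, termination signals, and spawn errors; this is standard Node API usage, not the exact code in this PR, and `binaryPath`/`args` are placeholders:

```ts
import { spawn } from 'child_process';

// Spawn the nitro binary and forward its lifecycle events to the caller.
export const spawnNitro = (binaryPath: string, args: string[]) => {
  const child = spawn(binaryPath, args, { stdio: ['ignore', 'pipe', 'pipe'] });

  child.stdout?.on('data', (chunk) => process.stdout.write(chunk));
  child.stderr?.on('data', (chunk) => process.stderr.write(chunk));

  // 'exit' fires with either a numeric exit code or the signal that killed the
  // process (e.g. 'SIGSEGV' after a crash); exactly one of them is non-null.
  child.on('exit', (code, signal) => {
    console.log(`nitro exited with code=${code} signal=${signal}`);
  });

  // 'error' fires when the process could not be spawned or killed.
  child.on('error', (err) => {
    console.error('nitro subprocess error:', err);
  });

  return child;
};
```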
/**
 * Get GPU information
 */
async function updateGpuInfo(): Promise<void> {
I think this one should be in the app, as it's very Jan-specific support.
For the normal use case, people export `CUDA_VISIBLE_DEVICES`.
The logic to assign Nitro to the GPU with the highest VRAM should not be here.
I'm not sure about this; do you think most people will use the library responsibly (exporting `CUDA_VISIBLE_DEVICES`), or will they prefer a fool-proof default and let the library define sensible options for them? 🤔
What is sensible, and what is your definition of Nitro's sensible way to choose?
I don't believe any lib can understand hardware better than the user of the lib who owns the hardware.
Even PyTorch uses default allocation for all GPUs and lets users know that they can override which GPU to use.
What do you think?
Yes, I mean if users do not provide anything, then the library will use defaults based on the analysis. And for those who know and want to be specific, we let them specify it.
For now, I'll add support for an env var to let users specify their preferred devices (sketch below).
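A minimal sketch of that fallback, assuming the env var in question is `CUDA_VISIBLE_DEVICES` and a hypothetical `detectGpus()` helper provides the analysis result (both names are assumptions):

```ts
// Hypothetical detection result; the real analysis lives elsewhere in the library.
declare function detectGpus(): Promise<{ id: string; vramMb: number }[]>;

// Prefer the user's explicit CUDA_VISIBLE_DEVICES; otherwise fall back to the
// GPUs found by the analysis step, sorted by available VRAM.
export const resolveVisibleDevices = async (): Promise<string> => {
  const fromEnv = process.env.CUDA_VISIBLE_DEVICES;
  if (fromEnv && fromEnv.trim().length > 0) return fromEnv;

  const gpus = await detectGpus();
  return gpus
    .sort((a, b) => b.vramMb - a.vramMb)
    .map((gpu) => gpu.id)
    .join(',');
};
```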
 * @returns {(NitroPromptSetting | never)} parsed prompt setting
 * @throws {Error} if cannot split promptTemplate
 */
export function promptTemplateConverter(
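For context, a minimal sketch of what such a converter might do, assuming `{system_message}` and `{prompt}` placeholders and a `NitroPromptSetting` shape with `system_prompt`/`user_prompt`/`ai_prompt` fields (these names are assumptions, not necessarily the PR's exact types):

```ts
interface NitroPromptSetting {
  system_prompt?: string;
  user_prompt?: string;
  ai_prompt?: string;
}

// Split a template like "<sys>{system_message}</sys> USER: {prompt} ASSISTANT:"
// into the prefixes expected around the system message and the user prompt.
export function promptTemplateConverter(promptTemplate: string): NitroPromptSetting {
  const systemMarker = '{system_message}';
  const promptMarker = '{prompt}';

  const systemIdx = promptTemplate.indexOf(systemMarker);
  const promptIdx = promptTemplate.indexOf(promptMarker);
  if (systemIdx === -1 || promptIdx === -1 || promptIdx < systemIdx) {
    throw new Error('Cannot split promptTemplate');
  }

  return {
    system_prompt: promptTemplate.slice(0, systemIdx),
    user_prompt: promptTemplate.slice(systemIdx + systemMarker.length, promptIdx),
    ai_prompt: promptTemplate.slice(promptIdx + promptMarker.length),
  };
}
```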
I actually think this should be in the Nitro C++ code, not here.
And it doesn't thoroughly cover many of the cases that we have.
You're right, but for the sake of this PR, let's just create another issue for the template converter. I'll check the flow of the C++ code later and adapt this in another PR.
ok agree
// The platform architecture
//const ARCH = process.env.npm_config_arch || process.arch;

const getReleaseInfo = async (taggedVersion: string): Promise<any> => {
Do you think that using node-gyp to install and build nitro locally during `npm install` would be better?
I think this one is useful.
I prefer to skip node-gyp at this stage. As I understand it, the primary use case of node-gyp is to build Node native modules (e.g. when we want to build nitro into *.so files and use a Node binding to control nitro directly instead of going through a subprocess wrapper).
Using node-gyp just to get the binaries is somewhat overkill, and supporting programmatic builds on every supported platform is another story. At this point I'm not confident that I can support it without people raising issues (the build env involves PATH, OS versions, available tooling, etc.). For now, I don't think stepping into it would make good use of the time; more issues would arise...
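As a rough sketch of the download-instead-of-build approach, `getReleaseInfo` could query the GitHub releases API for a tagged version and pick the asset matching the resolved variant; the repo path, asset-matching logic, and variant naming below are assumptions, and the global `fetch` assumes Node 18+:

```ts
// Fetch release metadata for a tagged nitro version from the GitHub API.
const getReleaseInfo = async (taggedVersion: string): Promise<any> => {
  const url = `https://api.github.com/repos/janhq/nitro/releases/tags/${taggedVersion}`;
  const res = await fetch(url, { headers: { Accept: 'application/vnd.github+json' } });
  if (!res.ok) throw new Error(`Failed to fetch release info: ${res.status}`);
  return res.json();
};

// Pick the download URL of the asset whose name contains the resolved variant,
// e.g. 'linux-amd64-cuda' (the naming scheme is still under discussion above).
const findAssetUrl = (release: any, variant: string): string | undefined =>
  release.assets?.find((a: any) => a.name.includes(variant))?.browser_download_url;
```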
Then, basically, what is the point of wrapping the Nitro binaries in a big Nitro-Node package?
Nitro itself is just a ~3 MB binary with an OpenAI-compatible API pointing to llama.cpp, with one single command and two fetch POSTs to use.
I understand your point, but given the state of this, if `node-nitro` does not provide anything surplus to the existing one, I don't think it should exist, as it adds a lot of overhead and grey-box.
The intention for now is simple: to ease the usage of nitro in Node.js programs without too much boilerplate code repeated and bloated across projects. It's not about overhead but rather about practical usage; it's just a simple wrapper for now, with no intention to bring surplus value to what nitro already has, but rather to improve the developer experience of making a client program that utilizes nitro. 😅
For building nitro during installation when no binary is available for the platform, let's move it to another issue, as I believe it will be rather complex. Let's deliver what is useful for gaining more adoption of nitro from fellow developers, then improve it later with more features...😅
FYI: I'm looking at a reference build script at the link below:
https://github.com/withcatai/node-llama-cpp/blob/master/src/cli/commands/BuildCommand.ts
They also don't use node-gyp but rather have their own build script. I also prefer it this way; not relying on node-gyp unless necessary means no dependency on Python and the other native libs node-gyp requires. Then we can have a less demanding library => more usage from other projects => more visibility, and more contribution back to nitro and this library. 😉
/**
 * Get the system resources information
 */
export async function getResourcesInfo(): Promise<ResourcesInfo> {
I think this one is about how Jan is USING Nitro; it should not be something here to be reused.
This one is for getting the number of CPU cores on the machine, to provide a default value when none is specified when running nitro. It's also useful for those who use nitro-node, I believe.
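A minimal sketch of that default, assuming the setting in question is the inference thread count (the parameter name here is a placeholder):

```ts
import * as os from 'os';

// Default the thread count to the number of cores reported by Node,
// unless the caller passed an explicit positive value.
export const resolveCpuThreads = (requested?: number): number =>
  requested && requested > 0 ? requested : os.cpus().length;
```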
Overall, thank you for the great job.
@InNoobWeTrust @hiro-v I would like to proceed and merge this pull request. Any further updates will be in separate PRs.
LGTM
LGTM. Let's resolve the pending ones in the next PRs.
Current progress
- Next step is to update janhq/core on npmjs and use the resolved package there instead of a submodule ~~for different OSes~~ ~~and add the tarballs of npm packages to release~~.
- From the janhq/jan build step for extensions, simply download nitro-node from ~~github release~~ npmjs with `npm pack @janhq/nitro-node --target_arch=x64 --target_platform=linux --pack-destination ../electron/pre-install/`

The plan:
- `@janhq/nitro-node` to be an independent library, depending on `@janhq/core` (ideal) or using the minimal interfaces and concepts from `@janhq/core`.
- From `@janhq/jan`, integrate `@janhq/nitro-node` for interacting with nitro.
- Build the extension and ~~put tarballs into release together with nitro~~ publish to npmjs under alpha/beta.
- From Jan, we will need to add some kind of version.txt for downloading the correct version of nitro-node and put it into ./electron/pre-install/.

How to build
From the root directory of the repo.
Issue:
Currently I'm cloning https://github.com/janhq/jan as a submodule in order to build the janhq/core package. The version on npm seems to be outdated and the imports do not work, although the version is displayed to be the same (https://www.npmjs.com/package/@janhq/core) => can you update it on npmjs, or does the team prefer to couple the versions tightly with the GitHub repo?