Nr. of devices needed #38
Comments
INFO:jax._src.xla_bridge:Unable to initialize backend 'rocm': NOT_FOUND: Could not find registered platform with name: "rocm". Available platform names are: CUDA
Is this what you get?
I put 1 instead of 8.
I keep getting the same error: PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'D:\dev\shm\tmpp53ohpcl'
I have the same issue; is there a way to resolve this?
Same issue, even with all requirements installed. I am using 8 GPUs.
I have 2 GPUs and everything installed OK as well.
In run.py, I changed line 60 (I have one 3090):
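The change being described is presumably the InferenceRunner configuration in run.py, which defaults to an 8-device mesh. A hedged sketch of the single-GPU edit, assuming the `local_mesh_config` argument mentioned later in this thread (the exact line number and the other arguments vary between checkouts and are elided here):

```python
# run.py (sketch): shrink the device mesh so its product matches 1 visible GPU
inference_runner = InferenceRunner(
    ...,                          # other arguments unchanged
    local_mesh_config=(1, 1),     # was (1, 8); product must equal device count
    between_hosts_config=(1, 1),
)
```

As later comments in this thread show, shrinking the mesh alone is not enough on most hardware: the weights still need hundreds of GB of memory on that single device.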
OK, got a little further this time! Traceback (most recent call last): I have 2 Quadro 5000s; I guess we don't have enough VRAM, doh.
I'm at the same point. GPT told me /dev/shm is a ramdisk, which means we don't have enough RAM, not VRAM. I have 64 GB; not sure how much we need... would 128 be enough?
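The claim above is easy to verify: on Linux, /dev/shm is a tmpfs mount backed by RAM (sized at half of physical memory by default), so files staged there count against RAM, not disk. A small sketch to inspect it:

```python
import os
import shutil

# /dev/shm is a RAM-backed tmpfs on Linux; its default size is half of RAM.
shm = "/dev/shm"
if os.path.isdir(shm):
    usage = shutil.disk_usage(shm)
    print(f"/dev/shm total: {usage.total / 1024**3:.1f} GiB, "
          f"free: {usage.free / 1024**3:.1f} GiB")
else:
    # e.g. Windows or macOS, where this path does not exist
    print("No /dev/shm on this system")
```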
I have 128 GB on this rig; with the two cards it's like 32 GB, which is why I assumed VRAM. Maybe I'm wrong.
Bummer... guess we'll just have to wait for a GGUF...
Possibly. I might spin up a RunPod, or wait for a GGUF; I was reading about people needing 8 GPUs.
After changing the mesh to (1, 6), I get this error:
Looks like it doesn't like 6 either.
Looks like I have to set
in the TransformerConfig to the number of devices as well.
Did you get it up and running?
Did it work after that?
@yarodevuci Still downloading the weights. I was under the impression that the test would download them (looks like I'm spoiled by the Hugging Face API, which does). Will report tomorrow. Right now it tells me 17 more hours (don't know why it's so slow; I am on 750 Mbit, but the magnet download is painfully slow).
I'm seeding (again); it took me most of the evening last night to download, and I have a 2000 Mbit connection.
My system has 192 GB of RAM, and I also encountered the same error.
@ad1tyac0des It creates a temp folder with over 300 GB in it; do you have that much space on the hard drive?
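Given the ~300 GB temp-folder figure above, it is worth checking free disk space before starting. A minimal sketch (the 300 GB threshold is just the number quoted in this thread, not an official requirement):

```python
import shutil

# Free space on the current directory's filesystem, in GB
free_gb = shutil.disk_usage(".").free / 1024**3
print(f"Free space: {free_gb:.0f} GB")
if free_gb < 300:
    print("Warning: likely not enough room for the unpacked checkpoint")
```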
Is there anybody here who saw the live presentation where the X developers ran it with exact commands, or are we all just testing it for them?
What a pain; downloading it cost me a whole day of effort.
I succeeded in increasing the space and got rid of this error, but in exchange I ended up with a system crash instead, so I will give up for now. I don't have enough RAM to run Grok-1, nor enough money to upgrade my hardware.
I had about 100 GB of storage left, but at the moment the error occurred, my system's RAM was completely utilized. This seems to be why the program stopped; the problem looks like high RAM usage rather than storage space.
I am at 272/300 GB right now. Excitement starts to kick in; let's hope this thing runs. I only have 6x 4090 (144 GB VRAM) and 512 GB RAM; if this isn't enough to at least run it, regardless of speed, then something is off.
OK, got a little further, but still no cigar:
But what about my MacBook M1 Pro with 16 GB / 512 GB?
It probably is not enough. I have 4 A100s and 512 GB per node as well, and I am not sure I can run it. It has been stuck loading checkpoints for a while now.
You should install jaxlib for CUDA so that your 8 GPUs can be detected. Or you can set local_mesh_config=(1, 1), and Grok will run on the CPU.
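Once a CUDA-enabled jaxlib is installed (see the JAX installation docs for the wheel matching your CUDA version), a quick sanity check shows whether the GPUs are actually visible. A small sketch, guarded so it also runs where jax is absent:

```python
# Sanity check: does JAX see the GPUs? Requires jax/jaxlib to be installed;
# with a CPU-only jaxlib this prints backend "cpu" and device count 1.
try:
    import jax
    print("backend:", jax.default_backend())
    print("device count:", jax.device_count())
except ImportError:
    print("jax is not installed in this environment")
```

If this reports backend "cpu" despite GPUs being present, the installed jaxlib is the CPU-only build.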
After doing that, I got an error: zsh: killed python run.py
We are talking about machines with 512 GB of RAM and hundreds of GB of VRAM not being able to run it, let alone a laptop. You will have to wait for a WAY smaller version of it to run on a small machine.
I have an iMac with a 3.6 GHz 10-core Intel Core i9 processor, AMD Radeon Pro 5300 4 GB graphics, and 16 GB of 2667 MHz DDR4 memory ))) What do I need to change?
)))) You need a real data-center GPU compute node with at least 8x A100 80 GB to run Grok at this point. I doubt any quantized version would fit on a Mac anytime soon, but who knows? )))
raise ValueError(f'Number of devices {len(devices)} must equal the product ' |
You would need to change everything; ordinary consumer hardware cannot run this. Maybe an Amazon cloud server could run Grok-1, but the price will definitely be high.
Running
python run.py
on a single Nvidia GPU, it fails with: ValueError: Number of devices 1 must equal the product of mesh_shape (1, 8)
Can the number of devices be adjusted to 1 only?
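The error comes from a simple invariant: JAX must see exactly as many devices as the product of the configured mesh shape. A hypothetical re-creation of the check (`check_mesh` is an illustrative name, not a function from the repo):

```python
import math

def check_mesh(num_devices, mesh_shape):
    # The number of visible devices must equal the product of mesh_shape,
    # e.g. 8 devices for the default mesh (1, 8).
    if num_devices != math.prod(mesh_shape):
        raise ValueError(f"Number of devices {num_devices} must equal "
                         f"the product of mesh_shape {mesh_shape}")
    return True

check_mesh(8, (1, 8))    # default config on an 8-GPU box: passes
# check_mesh(1, (1, 8))  # one GPU vs. the default mesh: raises ValueError
```

So the mesh can be adjusted to (1, 1) for a single device, as suggested later in this thread, but then the whole model has to fit on that one device.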