Getting Petals to run on macOS #147
Comments
Sorry for taking so long to respond, we're a bit overwhelmed right now, will respond within the next 24 hours
No worries, it's the holiday season =) Have a Merry Christmas
Thanks! Should you run on M1? I found a guy with an M1 Max MacBook Pro to run some compute tests. Surprisingly, M1 is competitive for autoregressive inference: it's still about 2.5 times slower than an A6000, but far more energy efficient. For training, the comparison is less favourable, probably because you need more raw TFLOPS, not just fast memory. So, surprisingly, yes, it makes sense.

Can you run on M1? The current status is "you probably can, but it will require tinkering".
Notes:
[opinion] On the contrary, kWh matters a lot, but the actual kWh consumed is significantly lower, because not all GPUs are fully compute-utilized all the time, even under heavy use.
Awesome! It's good to have some validation that the idea actually makes more sense than crazy (my original assumption), even if it's just inference. Any idea how big the gap is for training (e.g. is it 5x slower)?

For more accurate kWh numbers, though, we might need more controlled tests, because the current numbers are napkin math taken from the spec sheets. Under full load, I find that M1 MacBook Pros are in general below spec, since the rated wattage typically accounts for Thunderbolt/USB connectors, so the gap might be bigger than suggested (need to find confirmation). For the M1 Max MBP, with the screen off and no additional peripherals, I believe it is clocked to max out at 65 W, which matches the typical USB-C power from a display + dock.

Next steps for me: gonna give it a try on a Mac Studio and a Mac mini, so we can get data points from both extreme ends!!! Is there any command (after step 6?) I can use to put a machine under the respective load, so I can try to get a more accurate in-system wattage reading? This would only apply for desktop Macs; for laptops it would need a wall meter (which I do not have), because the in-system reading will switch back and forth between battery and wall power.

Notes: And haha, yeah, agreed that kWh matters. I was assuming (wrongly) that the lower-end MacBooks might only be useful in a lesson/training scenario, for students to get some hands-on experience using machines they have at hand in class, rather than for actual production usage (due to the very limited memory size per node). But I realise that is an assumption that needs validation, especially regarding how the lower-end models are tuned for efficiency over performance.
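(A hedged aside on the in-system wattage question above: on a desktop Mac, macOS ships a built-in `powermetrics` tool that can report CPU/GPU power draw while a workload runs; the sampler names below are an assumption for recent macOS versions and may differ.)

```zsh
# Sample CPU/GPU power once per second while the Petals workload runs in another terminal.
# Sampler names are an assumption -- check `man powermetrics` on your macOS version.
sudo powermetrics --samplers cpu_power,gpu_power -i 1000
```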
Unfortunately I'm stuck at the last step, as it seems to still be using CUDA (scroll to the end). Using this space to log the whole macOS setup step by step. Because the macOS version required for M1 Macs defaults to ZSH, the whole process here assumes ZSH is used (and not bash). Date this was done: 3rd Jan 2023.

Setup conda environment with GPU support
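(The actual commands from this log were not preserved in the thread; what follows is a minimal sketch of this step, assuming an arm64 Miniconda/Miniforge install. The environment name and Python version are illustrative choices, not the author's.)

```zsh
# Create and activate a fresh environment; "petals-env" and Python 3.9 are arbitrary examples.
conda create -n petals-env python=3.9
conda activate petals-env
```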
Setup pytorch with GPU support
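(Again a sketch rather than the original commands; at the time, Metal (MPS) acceleration shipped in PyTorch's nightly builds, so an install along these lines was typical.)

```zsh
# Install a PyTorch build with MPS (Apple GPU) support from the nightly channel.
# On current PyTorch releases the stable channel may already be sufficient.
conda install pytorch torchvision torchaudio -c pytorch-nightly
```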
Optional: Setup a folder for all your subsequent files
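(A trivial sketch; the directory name is arbitrary.)

```zsh
# Keep model caches, notebooks, and logs in one place.
mkdir -p ~/petals-macos
cd ~/petals-macos
```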
Optional: Validate the pytorch install using jupyter
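(The original notebook isn't reproduced here; the essential check is that PyTorch was built with, and can see, the Metal backend. A quick command-line equivalent using the standard torch.backends.mps API:)

```zsh
# Both values should print True on an M1 Mac with a working MPS-enabled PyTorch build.
python -c "import torch; print(torch.backends.mps.is_built(), torch.backends.mps.is_available())"
```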
Install petals, and various other dependencies
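(A sketch of the install step, assuming the package is pulled straight from the main repository, as in the maintainers' later instructions further down this thread.)

```zsh
# Installs petals plus its dependencies, including hivemind and the bundled p2pd daemon.
pip install --upgrade git+https://github.com/bigscience-workshop/petals
```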
The full output text
@PicoCreator The error caused by the
Note that the Petals server won't support storing weights in 8-bit (that's what the
I tried to follow the instructions here to get it to run on a non-M1 Mac. The 'best' I managed to get is the following:
Now I have no idea why it tells me
As you can see, this is using
@ineiti The message you see is a part of the p2pd's output. Could you please ensure that you use the latest commit in learning-at-home/go-libp2p-daemon, hivemind, and petals? If it doesn't help, you can check out the full p2pd outputs by running the server like this:
There will be lots of debug outputs, but the daemon should report which arguments it doesn't understand somewhere among this text. If you can't find anything relevant, please send the
@borzunov OK, that works. Well, it doesn't, because I have an old MacBook Pro from 2018 with no CUDA support :( Also, why isn't this done automatically if I run
And where would I have found the information on how to build the correct p2pd? Or where should it be written?
I'm afraid this information can only be found in the readme for that library, here
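(For context, a hedged sketch of what building the daemon typically looks like; the repository layout and the destination path inside the hivemind package are assumptions and may differ between hivemind versions.)

```zsh
# Build the p2p daemon from the fork hivemind expects (requires a recent Go toolchain).
git clone https://github.com/learning-at-home/go-libp2p-daemon
cd go-libp2p-daemon/p2pd
go build .

# Replace the binary bundled with hivemind. The destination below is an assumption --
# the hivemind_cli directory can be located from the installed package path.
cp p2pd "$(python -c 'import hivemind, os; print(os.path.dirname(hivemind.__file__))')/hivemind_cli/p2pd"
```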
In fact I did try
First on my Mac, but for some reason this didn't work. The
@ineiti: don't know if this is still relevant, but I did manage to get it running on an older Intel Mac in the following way:
I'm sure there's a more elegant way to do this, but I'm not a Python guy so ...
Waiting for my new Mac and I'll try again...
I was hoping to host an instance of chat.petals.ml on one of Oracle Cloud's ARM Ampere instances, but I am having no luck getting Petals to run. I asked for advice in the Discord server and was given a custom branch to test (which removes the CPUFeatures module). After making some progress, I was pointed here, and I've spent several hours testing the recommendations with varying degrees of success. Can anybody offer some additional advice?

Here is the Dockerfile I'm working with:
This Dockerfile will build successfully on ARM. However, after running the container, you'll get the following error message:
If you omit the custom p2pd build, which was recommended by @vrosca, you'll get a different error:
I'm going to keep working with this, and I'll post an update if I make progress. Any advice you can give would be greatly appreciated!
Ok so, with the disclaimer that I'm terrible at Python and I only got this to work on a 2012 Intel MacBook Pro, here's what I did that might be helpful. In lib/python3.9/site-packages/hivemind/p2p/p2p_daemon.py, the arguments for p2pd are logged. In my case, on line 221, I changed the log level from debug to info:
You can then run the command from the console and see why it fails. That's how I got the final combination of Go version & go-libp2p-daemon that worked for me. Hope this helps
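(As an aside that is not part of @vrosca's steps above: if you would rather not edit the installed hivemind package, a generic way to capture the same information is to read the daemon's command line from the process table while the server is starting. A hedged sketch:)

```zsh
# While `python -m petals.cli.run_server ...` is starting in another terminal,
# grab the full argument list that hivemind passed to the p2pd daemon.
ps axww -o command | grep '[p]2pd'
```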
Thanks for the advice, @vrosca. Unfortunately, it didn't help me, but during the process of troubleshooting, I learned that the issue was PEBKAC! Put simply, I forgot to switch the Dockerfile's working directory back to the app directory, just before trying to launch the webserver. This is what I needed to add to the above Dockerfile:
Thanks to everyone who spent time documenting their efforts, I now have a working installation on ARM Ampere!
Hi @PicoCreator @ineiti @vrosca @LuciferianInk,

We've shipped native macOS support in #477 - both macOS clients and servers (including ones using the Apple M1/M2 GPU) now work out of the box. You can try the latest version with:

pip install --upgrade git+https://github.com/bigscience-workshop/petals

Please ensure that you use Python 3.10+ (you can use Homebrew to install one). Please let me know if you run into any issues while installing or using it!
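(For completeness, a hedged example of the full install sequence; the Homebrew formula name `python@3.10` and the `python3.10` binary name are assumptions, not part of the maintainers' instructions.)

```zsh
# Install a recent Python via Homebrew, then install/upgrade petals against it.
brew install python@3.10
python3.10 -m pip install --upgrade git+https://github.com/bigscience-workshop/petals
```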
The primary motivation is that, as of the latest beta, PyTorch has included optimisations for the M1 Metal GPU.
This presents an interesting possibility of scaling up more easily and affordably. For example, to hit 352 GB of memory...
(and assuming up to 75% of a Mac's memory can be allocated to the GPU; you could in theory go above 75%, but I suspect we need at least 25% for the OS and filesystem operations)
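(To make the napkin math concrete, a hedged illustration only, since the exact machine mix from the original comparison isn't preserved here: 352 GB is roughly what a 176B-parameter model needs at 2 bytes per parameter. With 64 GB Macs and 75% of memory usable for the GPU, that is about 48 GB per node, so 352 / 48 ≈ 7.3, i.e. roughly 8 machines; with 128 GB Mac Studios, 75% gives 96 GB per node, so 352 / 96 ≈ 3.7, i.e. roughly 4 machines.)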
However, if you were to try to build this using A100s, for example
Also as outlined, an alternative would be 30 student laptops / Mac minis ...
** not that it matters in this case
Making it possibly one of the most accessible ways for students to set up a private swarm and try training on their own hardware in a data lab.