Merged
80 commits
9b33454
clean up legacy prompting synapse
p-ferreira May 16, 2024
5d3e1b9
initial adjustments to protocol + chunks logging
p-ferreira May 16, 2024
b6b7e22
fix chunk processing + mock
p-ferreira May 16, 2024
f3e5466
adjust dendrite response
p-ferreira May 16, 2024
36344f0
refactors base pipeline apply function
p-ferreira May 16, 2024
b269365
adds tokenizer to flow + dendrite refactor
p-ferreira May 17, 2024
7227814
implements streaming reward function
p-ferreira May 17, 2024
dfe2de7
Merge pull request #225 from opentensor/main
p-ferreira May 17, 2024
a47bdf1
adds tokens_per_chunk in forward flow + adds global penalties
p-ferreira May 17, 2024
c5d9d06
Increase default max memory to 80 GB
dbobrenko May 18, 2024
02cd837
Change example run to 1 GPU
dbobrenko May 18, 2024
107a4d6
Fix validator hotkeys IndexError
dbobrenko May 19, 2024
807f162
Revert vllm changes
dbobrenko May 19, 2024
77c913e
Remove redundant validator code
dbobrenko May 19, 2024
8c2cfce
Fix comment style
dbobrenko May 19, 2024
a908392
Restore validator changes
dbobrenko May 19, 2024
6c9eed9
Restore config.py
dbobrenko May 19, 2024
44ec400
Remove clean function
dbobrenko May 19, 2024
07ff31a
Update comments
dbobrenko May 19, 2024
154a5ef
Update miner requirements, increase minimum storage
dbobrenko May 20, 2024
919780d
drops synapse final data out of accumulated chunks
p-ferreira May 20, 2024
28fe4bc
Set maximum GPU mem to 65
dbobrenko May 21, 2024
43862f6
Set maximum GPU mem to 62
dbobrenko May 21, 2024
5337dde
Decrease min memory to 62
dbobrenko May 21, 2024
dcb891a
Merge pull request #236 from dbobrenko/hotfix/vllm-optimize
p-ferreira May 21, 2024
afdf322
Merge pull request #237 from dbobrenko/feature/llama-requirements
p-ferreira May 21, 2024
a737825
Merge branch 'main' into features/stream-adjustments
p-ferreira May 21, 2024
4b862e7
adds unit tests for streaming reward model
p-ferreira May 21, 2024
da4a066
Properly update removed or replaced hotkeys
dbobrenko May 21, 2024
4d292f5
Set scores device
dbobrenko May 21, 2024
aa61f86
Optimize device setting
dbobrenko May 21, 2024
2dd0174
Merge with staging
dbobrenko May 22, 2024
0b8dc10
Revert vllm_llm to staging
dbobrenko May 22, 2024
b4fd266
Update README with 62 GB VRAM
dbobrenko May 22, 2024
6c85189
Change readme description to Llama3
dbobrenko May 22, 2024
9d67963
adjust reward calculation to include global penalties
p-ferreira May 22, 2024
369166d
Revert validator changes
dbobrenko May 22, 2024
03e37bd
Update miner deps
dbobrenko May 23, 2024
ffeb58d
Fix install script
dbobrenko May 23, 2024
c8ef7a4
Edit install.sh comments
dbobrenko May 23, 2024
34b52f8
Simplify setup
dbobrenko May 23, 2024
241437d
Add load_in_4bits flag to miner
dbobrenko May 23, 2024
921e858
Exit on CUDA OOM
dbobrenko May 28, 2024
854b7ad
remove unit test todo comment
p-ferreira May 28, 2024
7bbb698
Merge branch 'main' into staging
p-ferreira May 28, 2024
590a7c5
Merge branch 'staging' into features/stream-adjustments
p-ferreira May 28, 2024
d1b73eb
updates versioning
p-ferreira May 28, 2024
3808f4e
drops deprecated prompting synapse from mock code
p-ferreira May 28, 2024
bf61e97
fix mock pipeline
p-ferreira May 28, 2024
68b9a28
Update README.md
bkb2135 May 29, 2024
0a78c48
Update prompting/forward.py
dbobrenko May 30, 2024
dacfc6a
Address Pedro's comments
dbobrenko May 30, 2024
6840d1f
Bittensor upgrade to 7.0.0
dbobrenko May 31, 2024
e62c99f
Merge with staging
dbobrenko May 31, 2024
425ab9b
Add bit about discord
bkb2135 Jun 3, 2024
86bdbbb
Update README.md
bkb2135 Jun 3, 2024
596e687
Upgrade all requirements for vllm 0.4.2, specify versions
dbobrenko Jun 3, 2024
4306237
Merge pull request #247 from macrocosm-os/main
bkb2135 Jun 3, 2024
ff0de98
fix stream synapse import
p-ferreira Jun 4, 2024
1d3e6bc
fix mock dendrite call
p-ferreira Jun 4, 2024
e88e694
Sample all available uids
bkb2135 Jun 4, 2024
ee72240
Do not exclude uids that were just queried
bkb2135 Jun 4, 2024
db3ef35
Merge pull request #243 from macrocosm-os/hotfix/oom-repeated-tasks
p-ferreira Jun 5, 2024
1be9848
Merge pull request #235 from dbobrenko/hotfix/hotkeys-index-error
p-ferreira Jun 5, 2024
7925dcd
Merge branch 'staging' into features/discourage-base-miners
p-ferreira Jun 5, 2024
51e11dd
Merge branch 'staging' into feature/bittensor-7.0.0
p-ferreira Jun 5, 2024
e77eeef
Merge pull request #244 from macrocosm-os/features/discourage-base-mi…
p-ferreira Jun 5, 2024
dfb1c76
Merge branch 'staging' into features/stream-adjustments
p-ferreira Jun 5, 2024
4e05207
Merge pull request #246 from macrocosm-os/feature/bittensor-7.0.0
p-ferreira Jun 5, 2024
1961f6c
updates versioning
p-ferreira Jun 5, 2024
41d1c67
Merge branch 'staging' into features/stream-adjustments
p-ferreira Jun 5, 2024
54932c7
Update config.py
bkb2135 Jun 5, 2024
ec2b764
Update forward.py
steffencruz Jun 5, 2024
b1cbef3
Merge pull request #251 from macrocosm-os/hotfix/reduce-conversation-…
p-ferreira Jun 5, 2024
2aa77c8
Merge pull request #249 from macrocosm-os/feature/remove-random-uid-s…
p-ferreira Jun 5, 2024
834d70c
Merge pull request #234 from macrocosm-os/features/stream-adjustments
p-ferreira Jun 5, 2024
8d38f12
update bittensor requirements
p-ferreira Jun 5, 2024
794d3f7
Merge pull request #252 from macrocosm-os/hotfix/bittensor-package-up…
p-ferreira Jun 5, 2024
ea0843d
fix tokenizer issue, fix logging issue, adapt mock miner for unit test
p-ferreira Jun 5, 2024
29002ea
Merge pull request #253 from macrocosm-os/hotfix/tokenizer-issue
p-ferreira Jun 5, 2024
16 changes: 10 additions & 6 deletions README.md
@@ -47,9 +47,12 @@ bash install.sh

# Compute Requirements

1. To run a **validator**, you will need at least 24GB of VRAM.
2. To run the default huggingface **miner**, you will need at least 18GB of VRAM.
1. To run a **validator**, you will need at least 62GB of VRAM.
2. To run the default huggingface **miner**, you will need at least 62GB of VRAM.


**It is important to note that the base miners are not recommended for main, and exist purely as examples. Running a base miner on main will result in no emissions and a loss of your registration fee.**
If you have any questions please reach out in the SN1 channel in the Bittensor Discord.
</div>

# How to Run
@@ -77,10 +80,11 @@ For ease of use, you can run the scripts as well with PM2. Installation of PM2 i
sudo apt update && sudo apt install jq && sudo apt install npm && sudo npm install pm2 -g && pm2 update
```

Example of running a SOLAR miner:
Example of running a Llama3 miner:

```bash
pm2 start neurons/miners/huggingface/miner.py --interpreter python3 --name solar_miner -- --netuid 1 --subtensor.network finney --wallet.name my_wallet --wallet.hotkey m1 --neuron.model_id casperhansen/llama-3-70b-instruct-awq --axon.port 21988 --logging.debug
```
pm2 start neurons/miners/huggingface/miner.py --interpreter python3 --name llama3_miner -- --netuid 1 --subtensor.network finney --wallet.name my_wallet --wallet.hotkey m1 --neuron.model_id casperhansen/llama-3-70b-instruct-awq --neuron.load_in_4bit True --axon.port 21988 --logging.debug
```

# Testnet
We highly recommend that you run your miners on testnet before deploying on main. This will give you an opportunity to debug your systems and ensure that you will not lose valuable immunity time. The SN1 testnet is **netuid 61**.
@@ -90,7 +94,7 @@ In order to run on testnet, you will need to go through the same hotkey registra
To run:

```bash
pm2 start neurons/miners/huggingface/miner.py --interpreter python3 --name solar_miner -- --netuid 61 --subtensor.network test --wallet.name my_test_wallet --wallet.hotkey m1 --neuron.model_id casperhansen/llama-3-70b-instruct-awq --axon.port 21988 --logging.debug
pm2 start neurons/miners/huggingface/miner.py --interpreter python3 --name llama3_miner -- --netuid 61 --subtensor.network test --wallet.name my_test_wallet --wallet.hotkey m1 --neuron.model_id casperhansen/llama-3-70b-instruct-awq --neuron.load_in_4bit True --axon.port 21988 --logging.debug
```

# Limitations
12 changes: 6 additions & 6 deletions min_compute.yml
@@ -22,12 +22,12 @@ compute_spec:

gpu:
required: True # Does the application require a GPU?
min_vram: 20 # Minimum GPU VRAM (GB)
recommended_vram: 24 # Recommended GPU VRAM (GB)
min_vram: 62 # Minimum GPU VRAM (GB)
recommended_vram: 80 # Recommended GPU VRAM (GB)
cuda_cores: 1024 # Minimum number of CUDA cores (if applicable)
min_compute_capability: 6.0 # Minimum CUDA compute capability
recommended_compute_capability: 7.0 # Recommended CUDA compute capability
recommended_gpu: "NVIDIA A10" # Recommended GPU to purchase/rent
recommended_gpu: "NVIDIA A100" # Recommended GPU to purchase/rent

memory:
min_ram: 16 # Minimum RAM (GB)
@@ -36,7 +36,7 @@ compute_spec:
ram_type: "DDR4" # RAM type (e.g., DDR4, DDR3, etc.)

storage:
min_space: 24 # Minimum free storage space (GB)
min_space: 60 # Minimum free storage space (GB)
recommended_space: 100 # Recommended free storage space (GB)
type: "SSD" # Preferred storage type (e.g., SSD, HDD)
min_iops: 1000 # Minimum I/O operations per second (if applicable)
@@ -57,7 +57,7 @@ compute_spec:

gpu:
required: True # Does the application require a GPU?
min_vram: 80 # Minimum GPU VRAM (GB)
min_vram: 62 # Minimum GPU VRAM (GB)
recommended_vram: 80 # Recommended GPU VRAM (GB)
cuda_cores: 1024 # Minimum number of CUDA cores (if applicable)
min_compute_capability: 6.0 # Minimum CUDA compute capability
@@ -71,7 +71,7 @@ compute_spec:
ram_type: "DDR4" # RAM type (e.g., DDR4, DDR3, etc.)

storage:
min_space: 40 # Minimum free storage space (GB)
min_space: 60 # Minimum free storage space (GB)
recommended_space: 100 # Recommended free storage space (GB)
type: "SSD" # Preferred storage type (e.g., SSD, HDD)
min_iops: 1000 # Minimum I/O operations per second (if applicable)
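The min_compute.yml changes raise the VRAM floor to 62 GB and the storage floor to 60 GB for both roles. As a hedged illustration only (the constant and function names below are hypothetical and do not exist in the repository), a node could self-check against these floors before registering:

```python
# Hypothetical self-check against the floors set in min_compute.yml above.
# MIN_VRAM_GB / MIN_STORAGE_GB mirror the values in this diff; the function
# name is illustrative and not part of the repo.
MIN_VRAM_GB = 62
MIN_STORAGE_GB = 60


def check_compute(vram_gb: float, free_storage_gb: float) -> list:
    """Return human-readable failures; an empty list means the node qualifies."""
    failures = []
    if vram_gb < MIN_VRAM_GB:
        failures.append(f"GPU VRAM {vram_gb} GB < minimum {MIN_VRAM_GB} GB")
    if free_storage_gb < MIN_STORAGE_GB:
        failures.append(f"free storage {free_storage_gb} GB < minimum {MIN_STORAGE_GB} GB")
    return failures


print(check_compute(80, 100))  # [] -- an A100 80 GB node with 100 GB free passes
print(check_compute(24, 100))  # one failure: 24 GB VRAM is below the new floor
```

This is why the README diff above bumps both the validator and miner requirements from 24/18 GB to 62 GB: the Llama3 70B AWQ model no longer fits the old minimums.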
2 changes: 1 addition & 1 deletion prompting/__init__.py
@@ -16,7 +16,7 @@
# DEALINGS IN THE SOFTWARE.

# Define the version of the template module.
__version__ = "2.3.1"
__version__ = "2.4.0"
version_split = __version__.split(".")
__spec_version__ = (
(10000 * int(version_split[0]))
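The version bump above also changes the derived `__spec_version__`. Only the first term of the packing, `10000 * int(version_split[0])`, is visible in this diff; assuming the conventional `10000*major + 100*minor + patch` scheme (an assumption, since the remaining terms are truncated), the computation is:

```python
# Sketch of the spec-version packing, assuming the common
# 10000*major + 100*minor + patch scheme. Only the first term is
# visible in the diff; the rest is an assumption.
def spec_version(version: str) -> int:
    major, minor, patch = (int(part) for part in version.split("."))
    return 10000 * major + 100 * minor + patch


print(spec_version("2.4.0"))  # 20400
```

Under that scheme, the bump from "2.3.1" to "2.4.0" moves the spec version from 20301 to 20400.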
63 changes: 41 additions & 22 deletions prompting/dendrite.py
@@ -1,45 +1,59 @@
import torch
import bittensor as bt
from typing import List
from dataclasses import dataclass
from prompting.protocol import StreamPromptingSynapse
from prompting.utils.misc import serialize_exception_to_string


@dataclass
class SynapseStreamResult:
exception: BaseException = None
uid: int = None
accumulated_chunks: List[str] = None
accumulated_chunks_timings: List[float] = None
tokens_per_chunk: List[int] = None
synapse: StreamPromptingSynapse = None


class DendriteResponseEvent:
def __init__(
self, responses: List[bt.Synapse], uids: torch.LongTensor, timeout: float
self, stream_results: List[SynapseStreamResult], uids: torch.LongTensor, timeout: float
):
self.uids = uids
self.completions = []
self.status_messages = []
self.status_codes = []
self.timings = []
self.stream_results_uids = []
self.stream_results_exceptions = []
self.stream_results_all_chunks = []
self.stream_results_all_chunks_timings = []
self.stream_results_all_tokens_per_chunk = []

for stream_result in stream_results:
synapse = stream_result.synapse

for synapse in responses:
self.completions.append(synapse.completion)
self.status_messages.append(synapse.dendrite.status_message)
status_code = synapse.dendrite.status_code

if len(synapse.completion) == 0 and synapse.dendrite.status_code == 200:
synapse.dendrite.status_code = 204
if len(synapse.completion) == 0 and status_code == 200:
status_code = 204

self.status_codes.append(synapse.dendrite.status_code)

if (synapse.dendrite.process_time) and (
synapse.dendrite.status_code == 200
or synapse.dendrite.status_code == 204
):
self.timings.append(synapse.dendrite.process_time)
elif synapse.dendrite.status_code == 408:
self.status_codes.append(status_code)
process_time = synapse.dendrite.process_time or 0
if status_code == 200 or status_code == 204:
self.timings.append(process_time)
elif status_code == 408:
self.timings.append(timeout)
else:
self.timings.append(0) # situation where miner is not alive
self.timings.append(0)

self.completions = [synapse.completion for synapse in responses]
self.timings = [
synapse.dendrite.process_time or timeout for synapse in responses
]
self.status_messages = [
synapse.dendrite.status_message for synapse in responses
]
self.status_codes = [synapse.dendrite.status_code for synapse in responses]
self.stream_results_uids.append(stream_result.uid)
self.stream_results_exceptions.append(serialize_exception_to_string(stream_result.exception))
self.stream_results_all_chunks.append(stream_result.accumulated_chunks)
self.stream_results_all_chunks_timings.append(stream_result.accumulated_chunks_timings)
self.stream_results_all_tokens_per_chunk.append(stream_result.tokens_per_chunk)

def __state_dict__(self):
return {
@@ -48,6 +62,11 @@ def __state_dict__(self):
"timings": self.timings,
"status_messages": self.status_messages,
"status_codes": self.status_codes,
"stream_results_uids": self.stream_results_uids,
"stream_results_exceptions": self.stream_results_exceptions,
"stream_results_all_chunks": self.stream_results_all_chunks,
"stream_results_all_chunks_timings": self.stream_results_all_chunks_timings,
"stream_results_all_tokens_per_chunk": self.stream_results_all_tokens_per_chunk,
}

def __repr__(self):
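The dendrite.py refactor folds the status and timing bookkeeping into a single per-stream loop instead of the old trailing list comprehensions. A minimal, dependency-free sketch of that status/timing logic (the stub classes stand in for `bittensor` objects and are illustrative only):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StubDendrite:
    """Stand-in for bt.Dendrite response metadata (illustrative only)."""
    status_code: int
    status_message: str = ""
    process_time: Optional[float] = None


@dataclass
class StubSynapse:
    """Stand-in for a StreamPromptingSynapse carrying a completion."""
    completion: str
    dendrite: StubDendrite


def resolve_status_and_timing(synapse: StubSynapse, timeout: float):
    """Mirrors the per-response logic in DendriteResponseEvent.__init__."""
    status_code = synapse.dendrite.status_code
    # An empty completion with HTTP 200 is downgraded to 204 (No Content).
    if len(synapse.completion) == 0 and status_code == 200:
        status_code = 204
    process_time = synapse.dendrite.process_time or 0
    if status_code in (200, 204):
        timing = process_time
    elif status_code == 408:
        timing = timeout  # timed out: charge the full timeout budget
    else:
        timing = 0  # miner unreachable or errored
    return status_code, timing


print(resolve_status_and_timing(StubSynapse("", StubDendrite(200, process_time=1.5)), 10.0))  # (204, 1.5)
```

Note how the refactor reads `status_code` once into a local and mutates that, rather than writing the downgraded 204 back onto `synapse.dendrite.status_code` as the removed code did.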