diff --git a/README.md b/README.md
index cb09ad8a..b4ea5638 100644
--- a/README.md
+++ b/README.md
@@ -48,7 +48,7 @@
- [Registering your wallet](#registering-your-wallet)
- [Running a Miner](#running-a-miner)
- [Running a Validator](#running-a-validator)
-- [New Releases](#new-releases)
+- [Releases](#releases)
- [Troubleshooting](#troubleshooting)
- [Troubleshooting Subtensor](#troubleshooting-subtensor)
- [License](#license)
@@ -372,28 +372,9 @@ pm2 start neurons/validator.py \
> NOTE: to access the wandb UI to get statistics about the miners, you can click on this [link](https://wandb.ai/eclipsevortext/subvortex-team) and choose the validator run you want.
-## New Releases
+## Releases
-When a new version of the subnet is released, each miner/validatior have to be updated.
-
-> Be sure you are in the SubVortex directory
-
-Get the lastest version of the subnet
-
-```
-git pull
-```
-
-Install the dependencies
-
-```
-pip install -r requirements.txt
-pip install -e .
-```
-
-Restart miners/validators if running them in your base environment or restart pm2 by executing `pm2 restart all` if you are using pm2 as process manager.
-
-> NOTE: to access the wandb UI to get statistics about the miners, you can click on this [link](https://wandb.ai/eclipsevortext/subvortex-team) and choose the validator run you want.
+- [Release-2.1.0](./scripts/release/release-2.1.0/RELEASE-2.1.0.md)
## Troubleshooting
diff --git a/scripts/redis/docs/redis-backup.md b/scripts/redis/docs/redis-backup.md
new file mode 100644
index 00000000..9f17bd5c
--- /dev/null
+++ b/scripts/redis/docs/redis-backup.md
@@ -0,0 +1,77 @@
+This guide provides step-by-step instructions for creating and restoring a dump in Redis. Redis is an open-source, in-memory data structure store used as a database, cache, and message broker. Dumps are a way to back up and restore data in Redis.
+
+
+
+Table of Contents
+---
+
+- [Creating a Redis Dump](#creating-a-redis-dump)
+- [Restoring a Redis Dump](#restoring-a-redis-dump)
+
+
+
+## Creating a Redis Dump
+
+To create a dump of your Redis database, follow these steps:
+
+1. **Connect to Redis**: Open a terminal or command prompt and connect to your Redis instance using the `redis-cli` command:
+
+ ```bash
+ redis-cli -a $(sudo grep -Po '^requirepass \K.*' /etc/redis/redis.conf)
+ ```
+
+2. **Create the Dump**: Use the `SAVE` command to create a dump of the current database. This command saves the dataset to a file called `dump.rdb` in the Redis data directory. Note that `SAVE` blocks the server while it writes the file; on a large dataset you may prefer `BGSAVE`, which saves in the background.
+
+ ```bash
+ SAVE
+ ```
+
+3. **Exit Redis**: Exit the Redis CLI
+
+ ```bash
+ exit
+ ```
+
+4. **Make a copy**: Copy the dump file `dump.rdb`, located in `/var/lib/redis`, to create a backup
+
+ ```bash
+ sudo cp /var/lib/redis/dump.rdb /var/lib/redis/dump.bak.rdb
+ ```
+
+5. **Verify the Dump**: Check that the copy of the dump file (`dump.bak.rdb`) has been created in `/var/lib/redis`.
+ ```bash
+ ls /var/lib/redis
+ ```
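If you want to sanity-check the copy beyond listing the directory, a quick header check can help: every RDB file begins with the ASCII magic string `REDIS` followed by a four-digit version. The helper below is an illustrative sketch (the `looks_like_rdb` name and the throwaway demo file are ours, not part of Redis):

```python
import os
import tempfile

def looks_like_rdb(path: str) -> bool:
    """Return True if the file starts with the RDB magic string 'REDIS'."""
    with open(path, "rb") as f:
        header = f.read(9)  # e.g. b"REDIS0011": magic + version
    return header[:5] == b"REDIS"

# Demo on a throwaway file that mimics an RDB header
with tempfile.NamedTemporaryFile(delete=False, suffix=".rdb") as tmp:
    tmp.write(b"REDIS0011")
print(looks_like_rdb(tmp.name))  # True
os.remove(tmp.name)
```

On a real backup you would point it at the copy, e.g. `looks_like_rdb("/var/lib/redis/dump.bak.rdb")` (run with sufficient read permissions).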
+
+## Restoring a Redis Dump
+
+To restore a dump in Redis, follow these steps:
+
+1. **Stop the Redis Server**: If Redis is running, stop the Redis server:
+
+ ```bash
+ sudo systemctl stop redis-server.service
+ ```
+
+2. **Replace the Dump File**: Replace the existing `dump.rdb` file in the Redis data directory with the dump file you want to restore.
+
+3. **Start the Redis Server**: Start the Redis server again.
+
+ ```bash
+ sudo systemctl start redis-server.service
+ ```
+
+4. **Verify the Restoration**: Connect to Redis using `redis-cli` and verify that the data has been restored correctly:
+
+ ```bash
+   redis-cli -a $(sudo grep -Po '^requirepass \K.*' /etc/redis/redis.conf)
+ KEYS *
+ ```
+
+ This command will display all keys in the database, confirming that the restoration was successful.
+
+## Additional Notes
+
+- It's important to ensure that Redis is stopped before replacing the dump file to avoid data corruption.
+
+For more information about Redis and its commands, refer to the [Redis Documentation](https://redis.io/documentation).
diff --git a/scripts/release/release-2.1.0/RELEASE-2.1.0.md b/scripts/release/release-2.1.0/RELEASE-2.1.0.md
new file mode 100644
index 00000000..bb913f13
--- /dev/null
+++ b/scripts/release/release-2.1.0/RELEASE-2.1.0.md
@@ -0,0 +1,179 @@
+This guide provides step-by-step instructions for release 2.1.0.
+
+Previous Release: 2.0.0
+
+
+
+---
+
+- [Validator](#validator)
+  - [Rollout Process](#rollout-process)
+  - [Rollback Process](#rollback-process)
+- [Miner](#miner)
+  - [Rollout Process](#rollout-process-1)
+  - [Rollback Process](#rollback-process-1)
+- [Additional Resources](#additional-resources)
+
+---
+
+
+
+# Validator
+
+## Rollout Process
+
+1. **Backup Database**: Before starting the rollout process, backup your database using the [Backup Guide](../../redis/docs/redis-backup.md#creating-a-redis-dump).
+
+2. **Upgrade Subnet**: Check if you are on main or on a tag
+
+ ```bash
+ git branch -vvv
+ ```
+
+   You will see something similar to this:
+
+ ```bash
+ # If you are on a tag
+ * (HEAD detached at v0.2.4) d6e233a Merge pull request #13 from eclipsevortex/release/0.2.4
+
+ # If you are on main
+ * main 13e555e [origin/main] Merge pull request #19 from eclipsevortex/release/2.0.0
+ ```
+
+   > IMPORTANT
+   > The \* indicates your active branch. It has to point to either the tag or the main branch.
+
+ If you are on a tag branch, checkout main
+
+ ```bash
+ git checkout main
+ ```
+
+   Then, in either case, get the latest version of the subnet
+
+ ```bash
+ git pull
+ ```
+
+ Then, install the dependencies
+
+ ```bash
+ pip install -r requirements.txt
+ pip install -e .
+ ```
+
+3. **Restart validator**: Restart your validator to pick up the new version
+
+ ```bash
+ pm2 restart validator-92
+ ```
+
+4. **Check logs**: Check the validator logs and confirm you see some `New Block` entries
+ ```bash
+ pm2 logs validator-92
+ ```
+
+
+
+## Rollback Process
+
+If any issues arise during or after the rollout, follow these steps to perform a rollback:
+
+1. **Rollback Database**: Rollback the database by running the following from the **SubVortex** directory:
+
+   ```bash
+   python3 ./scripts/release/release-2.1.0/migration.py --run-type rollback
+ ```
+
+   You should see output similar to:
+
+ ```bash
+ 2024-03-29 22:08:27.867 | INFO | Loading database from localhost:6379
+ 2024-03-29 22:08:27.901 | INFO | Rollback starting
+ 2024-03-29 22:08:27.907 | INFO | Rollback done
+ 2024-03-29 22:08:27.908 | INFO | Checking rollback...
+ 2024-03-29 22:08:27.910 | INFO | Rollback checked successfully
+ ```
+
+   If any issue arises, restore your backup database using the [Backup Guide](../../redis/docs/redis-backup.md#restoring-a-redis-dump).
+
+2. **Downgrade Subnet**: Get the tags
+
+ ```bash
+ git fetch --tags
+ ```
+
+   Check that the tag v2.0.0 exists
+
+ ```bash
+ git tag
+ ```
+
+ Checkout the tag
+
+ ```bash
+ git checkout tags/v2.0.0
+ ```
+
+   You will see something similar to this (the sample below was captured on an earlier release; your output will reference v2.0.0):
+
+ ```
+ Note: switching to 'tags/v0.2.4'.
+
+ You are in 'detached HEAD' state. You can look around, make experimental
+ changes and commit them, and you can discard any commits you make in this
+ state without impacting any branches by switching back to a branch.
+
+ If you want to create a new branch to retain commits you create, you may
+ do so (now or later) by using -c with the switch command. Example:
+
+   git switch -c <new-branch-name>
+
+ Or undo this operation with:
+
+ git switch -
+
+ Turn off this advice by setting config variable advice.detachedHead to false
+
+ HEAD is now at d6e233a Merge pull request #13 from eclipsevortex/release/0.2.4
+ ```
+
+ Then install the dependencies
+
+ ```bash
+ pip install -r requirements.txt
+ pip install -e .
+ ```
+
+3. **Restart validator**: Restart your validator to pick up the new version
+
+ ```bash
+ pm2 restart validator-92
+ ```
+
+4. **Check logs**: Check the validator logs and confirm you see some `New Block` entries
+ ```bash
+ pm2 logs validator-92
+ ```
+
+
+
+# Miner
+
+## Rollout Process
+
+There is no rollout for miners.
+
+## Rollback Process
+
+There is no rollback for miners.
+
+
+
+# Additional Resources
+
+- [Backup Guide](../../redis/docs/redis-backup.md): Detailed instructions for backing up and restoring your database.
+
+
+
+For any further assistance or inquiries, please contact [**SubVortex Team**](https://discord.com/channels/799672011265015819/1215311984799653918)
diff --git a/scripts/release/release-2.1.0/migration.py b/scripts/release/release-2.1.0/migration.py
new file mode 100644
index 00000000..8c7a24d7
--- /dev/null
+++ b/scripts/release/release-2.1.0/migration.py
@@ -0,0 +1,93 @@
+import asyncio
+import argparse
+import bittensor as bt
+from redis import asyncio as aioredis
+
+from subnet.shared.utils import get_redis_password
+from subnet.shared.checks import check_environment
+
+
+def check_redis(args):
+ try:
+ asyncio.run(check_environment(args.redis_conf_path))
+ except AssertionError as e:
+ bt.logging.warning(
+ f"Something is missing in your environment: {e}. Please check your configuration, use the README for help, and try again."
+ )
+
+
+def rollout():
+ bt.logging.info("No rollout")
+
+
+async def rollback(args):
+ try:
+ bt.logging.info(
+ f"Loading database from {args.database_host}:{args.database_port}"
+ )
+ redis_password = get_redis_password(args.redis_password)
+ database = aioredis.StrictRedis(
+ host=args.database_host,
+ port=args.database_port,
+ db=args.database_index,
+ password=redis_password,
+ )
+
+ bt.logging.info("Rollback starting")
+ async for key in database.scan_iter("selection:*"):
+ await database.delete(key)
+ bt.logging.info("Rollback done")
+
+ bt.logging.info("Checking rollback...")
+ count = 0
+ async for key in database.scan_iter("selection:*"):
+ count += 1
+ if count == 0:
+ bt.logging.info("Rollback checked successfully")
+ else:
+ bt.logging.error(
+ f"Check rollback failed! You still have {count} keys to remove."
+ )
+
+ except Exception as e:
+ bt.logging.error(f"Error during rollback: {e}")
+
+
+async def main(args):
+    # Verify the redis environment before running any migration step
+    check_redis(args)
+
+    if args.run_type == "rollout":
+        rollout()
+    else:
+        await rollback(args)
+
+
+if __name__ == "__main__":
+ try:
+ parser = argparse.ArgumentParser()
+ parser.add_argument(
+ "--run-type",
+ type=str,
+ default="rollout",
+        help="Type of migration to execute. Possible values: rollout or rollback.",
+ )
+ parser.add_argument(
+ "--redis_password",
+ type=str,
+ default=None,
+ help="password for the redis database",
+ )
+ parser.add_argument(
+ "--redis_conf_path",
+ type=str,
+ default="/etc/redis/redis.conf",
+ help="path to the redis configuration file",
+ )
+ parser.add_argument("--database_host", type=str, default="localhost")
+ parser.add_argument("--database_port", type=int, default=6379)
+ parser.add_argument("--database_index", type=int, default=1)
+ args = parser.parse_args()
+
+ asyncio.run(main(args))
+ except KeyboardInterrupt:
+ print("KeyboardInterrupt")
+ except ValueError as e:
+ print(f"ValueError: {e}")
diff --git a/subnet/validator/challenge.py b/subnet/validator/challenge.py
index cc8cf376..97ba54ef 100644
--- a/subnet/validator/challenge.py
+++ b/subnet/validator/challenge.py
@@ -10,6 +10,7 @@
AVAILABILITY_FAILURE_REWARD,
LATENCY_FAILURE_REWARD,
DISTRIBUTION_FAILURE_REWARD,
+ RELIABILLITY_WEIGHT_FAILURE_REWARD,
AVAILABILITY_WEIGHT,
LATENCY_WEIGHT,
RELIABILLITY_WEIGHT,
@@ -17,7 +18,7 @@
)
from subnet.shared.subtensor import get_current_block
from subnet.validator.event import EventSchema
-from subnet.validator.utils import ping_and_retry_uids
+from subnet.validator.utils import ping_and_retry_uids, get_next_uids, ping_uid
from subnet.validator.localisation import get_country
from subnet.validator.bonding import update_statistics
from subnet.validator.state import log_event
@@ -30,6 +31,20 @@
CHALLENGE_NAME = "Challenge"
+DEFAULT_PROCESS_TIME = 5
+
+
+async def check_miner_availability(self, uid: int):
+    # Check whether the miner is reachable
+    available = False
+
+    try:
+        # Ping the miner - miner and subtensor are unique so we consider a failure if one or the other is not reachable
+        available = await ping_uid(self, uid)
+    except Exception:
+        available = False
+
+    return available
async def handle_synapse(self, uid: int):
@@ -40,6 +55,13 @@ async def handle_synapse(self, uid: int):
country = get_country(ip)
bt.logging.debug(f"[{CHALLENGE_NAME}][{uid}] Subtensor country {country}")
+ # Check miner is available
+ available = await check_miner_availability(self, uid)
+    if not available:
+ bt.logging.warning(f"[{CHALLENGE_NAME}][{uid}] Miner is not reachable")
+ return available, country, DEFAULT_PROCESS_TIME
+
+ # Check the subtensor is available
process_time = None
try:
# Create a subtensor with the ip return by the synapse
@@ -70,10 +92,10 @@ async def handle_synapse(self, uid: int):
bt.logging.trace(
f"[{CHALLENGE_NAME}][{uid}] Verified ? {verified} - val: {validator_block}, miner:{miner_block}"
)
- except Exception as err:
+ except Exception:
verified = False
- process_time = 5 if process_time is None else process_time
- bt.logging.warning(f"[{CHALLENGE_NAME}][{uid}] Verified ? False")
+ process_time = DEFAULT_PROCESS_TIME if process_time is None else process_time
+ bt.logging.warning(f"[{CHALLENGE_NAME}][{uid}] Subtensor not verified")
return verified, country, process_time
@@ -100,7 +122,8 @@ async def challenge_data(self):
)
# Select the miners
- uids, _ = await ping_and_retry_uids(self, k=10)
+ validator_hotkey = self.metagraph.hotkeys[self.uid]
+ uids = await get_next_uids(self, validator_hotkey, k=10)
bt.logging.debug(f"[{CHALLENGE_NAME}] Available uids {uids}")
# Initialise the rewards object
@@ -121,6 +144,8 @@ async def challenge_data(self):
reliability_scores = []
distribution_scores = []
+ bt.logging.info(f"[{CHALLENGE_NAME}] Computing uids scores")
+
# Compute the score
for idx, (uid, (verified, country, process_time)) in enumerate(
zip(uids, responses)
@@ -172,8 +197,10 @@ async def challenge_data(self):
bt.logging.debug(f"[{CHALLENGE_NAME}][{uid}] Latency score {latency_score}")
# Compute score for reliability
- reliability_score = await compute_reliability_score(
- uid, self.database, hotkey
+ reliability_score = (
+ await compute_reliability_score(uid, self.database, hotkey)
+ if verified
+ else RELIABILLITY_WEIGHT_FAILURE_REWARD
)
reliability_scores.append(reliability_score)
bt.logging.debug(
@@ -183,7 +210,7 @@ async def challenge_data(self):
# Compute score for distribution
distribution_score = (
compute_distribution_score(idx, responses)
- if responses[idx][2] is not None
+ if verified and responses[idx][2] is not None
else DISTRIBUTION_FAILURE_REWARD
)
distribution_scores.append((uid, distribution_score))
@@ -254,7 +281,9 @@ async def challenge_data(self):
1 - alpha
) * self.moving_averaged_scores.to(self.device)
event.moving_averaged_scores = self.moving_averaged_scores.tolist()
- bt.logging.trace(f"[{CHALLENGE_NAME}] Updated moving avg scores: {self.moving_averaged_scores}")
+ bt.logging.trace(
+ f"[{CHALLENGE_NAME}] Updated moving avg scores: {self.moving_averaged_scores}"
+ )
# Display step time
forward_time = time.time() - start_time
diff --git a/subnet/validator/config.py b/subnet/validator/config.py
index 7101f07a..1d374b79 100644
--- a/subnet/validator/config.py
+++ b/subnet/validator/config.py
@@ -246,7 +246,7 @@ def add_args(cls, parser):
"--wandb.run_step_length",
type=int,
help="How many steps before we rollover to a new run.",
- default=360,
+ default=720,
)
parser.add_argument(
"--wandb.notes",
diff --git a/subnet/validator/score.py b/subnet/validator/score.py
index 53b4de91..89706a47 100644
--- a/subnet/validator/score.py
+++ b/subnet/validator/score.py
@@ -21,8 +21,12 @@ async def compute_reliability_score(uid, database, hotkey: str):
await database.hget(stats_key, "challenge_successes") or 0
)
challenge_attempts = int(await database.hget(stats_key, "challenge_attempts") or 0)
- bt.logging.trace(f"[{uid}][Score][Reliability] # challenge attempts {challenge_attempts}")
- bt.logging.trace(f"[{uid}][Score][Reliability] # challenge succeeded {challenge_successes}")
+ bt.logging.trace(
+ f"[{uid}][Score][Reliability] # challenge attempts {challenge_attempts}"
+ )
+ bt.logging.trace(
+ f"[{uid}][Score][Reliability] # challenge succeeded {challenge_successes}"
+ )
# Step 2: Normalization
normalized_score = wilson_score_interval(challenge_successes, challenge_attempts)
@@ -33,7 +37,9 @@ async def compute_reliability_score(uid, database, hotkey: str):
def compute_latency_score(idx, uid, validator_country, responses):
initial_process_times = [response[2] for response in responses]
bt.logging.trace(f"[{uid}][Score][Latency] Process times {initial_process_times}")
- bt.logging.trace(f"[{uid}][Score][Latency] Process time {initial_process_times[idx]}")
+ bt.logging.trace(
+ f"[{uid}][Score][Latency] Process time {initial_process_times[idx]}"
+ )
# Step 1: Get the localisation of the validator
validator_localisation = get_localisation(validator_country)
@@ -54,7 +60,7 @@ def compute_latency_score(idx, uid, validator_country, responses):
location["longitude"],
)
- scaled_distance = distance / MAX_DISTANCE
+ scaled_distance = distance / MAX_DISTANCE if distance > 0 else 0
tolerance = 1 - scaled_distance
process_time = process_time * tolerance if process_time else 5
@@ -84,7 +90,11 @@ def compute_latency_score(idx, uid, validator_country, responses):
score = relative_latency_scores[idx]
bt.logging.trace(f"[{uid}][Score][Latency] Relative score {score}")
- normalized_score = (score - min_score) / (max_score - min_score)
+ normalized_score = (
+ (score - min_score) / (max_score - min_score)
+ if max_score - min_score > 0
+ else 0
+ )
return normalized_score
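The two guards added above (skipping the division when the distance is zero and when all relative scores are equal) follow the same pattern. Here is a standalone sketch of the min-max step; the `normalize` helper is illustrative, not part of the codebase:

```python
def normalize(score: float, min_score: float, max_score: float) -> float:
    """Min-max normalize a score, returning 0 when the range is empty
    instead of dividing by zero."""
    score_range = max_score - min_score
    return (score - min_score) / score_range if score_range > 0 else 0.0

scores = [1.0, 2.0, 4.0]
print([normalize(s, min(scores), max(scores)) for s in scores])
# -> [0.0, 0.3333333333333333, 1.0]

# All miners identical: the range is 0, so every score maps to 0.0
print([normalize(s, 2.0, 2.0) for s in [2.0, 2.0]])
# -> [0.0, 0.0]
```

Without the guard, a batch where every miner produced the same relative latency would raise `ZeroDivisionError` and abort the whole challenge step.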
diff --git a/subnet/validator/utils.py b/subnet/validator/utils.py
index ade23698..789d3e20 100644
--- a/subnet/validator/utils.py
+++ b/subnet/validator/utils.py
@@ -148,6 +148,26 @@ async def get_available_query_miners(
return get_pseudorandom_uids(self, muids, k=k)
+async def ping_uid(self, uid):
+ """
+    Ping a single UID to check its availability.
+    Returns True if the miner responded with a 200 status code, False otherwise.
+ """
+ try:
+ response = await self.dendrite(
+ self.metagraph.axons[uid],
+ bt.Synapse(),
+ deserialize=False,
+ timeout=5,
+ )
+
+ return response.dendrite.status_code == 200
+ except Exception as e:
+ bt.logging.error(f"Dendrite ping failed: {e}")
+
+ return False
+
+
async def ping_uids(self, uids):
"""
Ping a list of UIDs to check their availability.
@@ -220,4 +240,52 @@ async def ping_and_retry_uids(
f"Insufficient successful UIDs for k: {k} Success UIDs {successful_uids} Failed UIDs: {failed_uids}"
)
- return list(successful_uids)[:k], failed_uids
\ No newline at end of file
+ return list(successful_uids)[:k], failed_uids
+
+
+async def get_next_uids(self, ss58_address: str, k: int = 4):
+ # Get the list of uids already selected
+ uids_already_selected = await get_selected_miners(self, ss58_address)
+ bt.logging.debug(f"get_next_uids() uids already selected: {uids_already_selected}")
+
+ # Get the list of available uids
+ uids = await get_available_query_miners(self, k=k, exclude=uids_already_selected)
+ bt.logging.debug(f"get_next_uids() uids:{uids}")
+
+ # Get the k uids requested
+ uids_selected = list(set(uids) - set(uids_already_selected))
+
+ # If no uids available we start again
+ if len(uids_selected) < k:
+ uids_already_selected = []
+
+ # Complete the selection with k - len(uids_selected) elements
+        # We always want to have k miners selected
+ new_uids_selected = await get_available_query_miners(self, k=k)
+ uids_selected = uids_selected + new_uids_selected[:k - len(uids_selected)]
+
+ bt.logging.debug(f"get_next_uids() uids selected: {uids_selected}")
+
+ # Store the new selection in the database
+ selection_key = f"selection:{ss58_address}"
+ selection = ",".join(str(uid) for uid in uids_already_selected + uids_selected)
+ await self.database.set(selection_key, selection)
+ bt.logging.debug(f"get_next_uids() new uids selection stored: {selection}")
+
+ return uids_selected
+
+
+async def get_selected_miners(self, ss58_address: str):
+ selection_key = f"selection:{ss58_address}"
+
+ # Get the uids selection
+ value = await self.database.get(selection_key)
+ if value is None:
+        bt.logging.debug("get_selected_miners() no uids")
+ return []
+
+ # Get the uids already selected
+ uids_str = value.decode("utf-8") if isinstance(value, bytes) else value
+ uids = [int(uid) for uid in uids_str.split(",")]
+
+ return uids
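The rotation implemented by `get_next_uids` and `get_selected_miners` can be sketched without Redis or bittensor. Everything below (the `FakeStore` class, the fixed uid pool, the `next_uids` name) is a hypothetical stand-in that only mirrors the shape of the logic:

```python
import random

class FakeStore:
    """Dict-backed stand-in for the async Redis database (illustration only)."""
    def __init__(self):
        self.data = {}
    def get(self, key):
        return self.data.get(key)
    def set(self, key, value):
        self.data[key] = value

def next_uids(store, ss58_address: str, pool, k: int = 4):
    key = f"selection:{ss58_address}"
    raw = store.get(key)
    already = [int(u) for u in raw.split(",")] if raw else []

    # Pick only uids that were not selected in previous rounds
    candidates = [u for u in pool if u not in already]
    selected = random.sample(candidates, min(k, len(candidates)))

    # Pool exhausted: reset the rotation and top up to k uids
    if len(selected) < k:
        already = []
        extra = [u for u in pool if u not in selected]
        selected += random.sample(extra, k - len(selected))

    store.set(key, ",".join(str(u) for u in already + selected))
    return selected

store = FakeStore()
pool = list(range(10))
first = next_uids(store, "validator-hotkey", pool)
second = next_uids(store, "validator-hotkey", pool)
print(sorted(first), sorted(second))  # two disjoint batches of 4 uids
```

This only mirrors the selection bookkeeping; the real implementation also filters by availability via `get_available_query_miners` before sampling.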