All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
1.0.0 - 2024-10-14
- The Dependency object now takes an optional parameter binary_dependencies to specify binary packages to be installed in the computation container. (#249)
- The CUDA base Docker image is now nvidia/cuda:12.6.1-runtime-ubuntu24.04 (#248)
- Remove stray versions of setuptools in Dockerfiles and install setuptools>70.0.0 to address the latest identified CVEs (#250)
- Bump NumPy and pytorch versions in tests. (#252)
- Drop Python 3.9 support. (#247)
0.47.0 - 2024-09-12
- Python 3.12 support (#226)
- Add Docker GPU base image, activated through the Dependency object with the variable use_gpu=True. The Docker image used is nvidia/cuda:11.8.0-runtime-ubuntu22.04. (#227)
- BREAKING: change use_gpu to disable_gpu in all TorchAlgo. The device is set to cpu if no GPU is available or if disable_gpu is set to True. You must invert the boolean in your code to keep the same behaviour (disable_gpu == not use_gpu). (#241)
- Remove packages named build-essential and *-dev after building dependencies to decrease CVE (#242)
- Add a non-root user to the generated Dockerfile for the compute functions. Compute pods were already running as non-root (ensured by a security context in the backend), we are making it more explicit here. (#228)
- Added subprocess_only tag to prevent simulation mode tests from running in remote mode. (#229)
- Bump pytorch version to 2.2.1 in tests. (#230)
- Bump NumPy version to 1.26.4 in tests. (#231)
- Actually trigger the GPU docker configuration with use_gpu flag when running Camelyon benchmark (#244)
- Use Tensor.cpu() to copy the tensor to host memory first in Camelyon benchmark (#245)
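For the use_gpu to disable_gpu change in this release (#241), the boolean inversion can be sketched as below. This is only an illustration: migrate_gpu_kwargs and resolve_device are hypothetical helpers, not part of SubstraFL, and the "cuda" device name is an assumption.

```python
def migrate_gpu_kwargs(kwargs: dict) -> dict:
    """Return a copy of kwargs where use_gpu is replaced by disable_gpu.

    Hypothetical helper: it applies the documented rule disable_gpu == not use_gpu.
    """
    migrated = dict(kwargs)
    if "use_gpu" in migrated:
        migrated["disable_gpu"] = not migrated.pop("use_gpu")
    return migrated


def resolve_device(disable_gpu: bool, gpu_available: bool) -> str:
    """Mirror the documented rule: cpu if no GPU is available or if disable_gpu is True.

    The "cuda" name for the GPU device is an assumption made for this sketch.
    """
    return "cpu" if disable_gpu or not gpu_available else "cuda"


old_kwargs = {"use_gpu": True, "seed": 42}
new_kwargs = migrate_gpu_kwargs(old_kwargs)
```

Code that previously passed use_gpu=True would now pass disable_gpu=False, and the reverse.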
0.46.0 - 2024-06-03
- Add apt update to docker user images to limit vulnerabilities. (#213)
0.45.0 - 2024-03-27
- New CLI arguments to Camelyon benchmark (--torch-gpu and --cp-name) (#201)
- Apply changes from breaking PR on Substra (Substra/substra#405) (#202)
- Deprecate setup.py in favour of pyproject.toml (#204)
0.44.0 - 2024-03-07
- Add documentation on how to change SubstraFL log level (#194)
- Add the simulate_experiment function, that will execute the Compute Plan in RAM only. It returns Python objects containing the computed Performances and the saved intermediate States. More information about this feature is available in docstrings (#184).
  Example of usage:
      from substrafl.experiment import simulate_experiment

      scores, intermediate_state_train, intermediate_state_agg = simulate_experiment(
          client=my_substra_client,
          strategy=my_strategy,
          train_data_nodes=train_data_nodes,
          evaluation_strategy=my_eval_strategy,
          aggregation_node=aggregation_node,
          clean_models=False,
          num_rounds=NUM_ROUNDS,
      )
- BREAKING: rename datasamples to data_from_opener (#193)
- Bump documentation dependencies to Sphinx 7.2.6 (#195)
- The predict task does not exist anymore. The evaluation of a model is done in a single task #177
- Strategy implements an evaluate method, with the @remote_data decorator, to compute the evaluation of the model. The evaluate method is the same for all strategies #177
- BREAKING: the perform_predict method of Strategy changed in favor of perform_evaluation that calls the new evaluate method #177
- BREAKING: metric_functions are now passed to the Strategy instead of the TestDataNode #177
- BREAKING: the predict method of Algo has no @remote_data decorator anymore. Its signature does not take prediction_path anymore, and the predictions are returned by the method #177
- Abstract base class Node is replaced by Protocols, defined in substrafl.nodes.protocol.py (#185)
- BREAKING: rename test_data_sample_keys, test_tasks and register_test_operations to data_sample_keys, tasks and register_operations in TestDataNodes (#185)
- BREAKING: InputIdentifiers and OutputIdentifiers move from substrafl.nodes.node to substrafl.nodes.schemas (#185)
- Switch to python-slim as base image, instead of substra-tools (#197)
- Dropped support for Python 3.8 (#200)
- Numerical stability of the NewtonRaphson strategy is improved by symmetrizing the Hessian (#196)
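The entry above does not say how the Hessian is symmetrized. A common approach, shown here purely as an assumption, is to replace H with (H + H^T) / 2, which removes small asymmetries introduced by floating-point accumulation. The symmetrize function and the sample matrix are hypothetical.

```python
def symmetrize(matrix):
    """Return (M + M^T) / 2 for a square matrix given as a list of rows."""
    n = len(matrix)
    return [[(matrix[i][j] + matrix[j][i]) / 2 for j in range(n)] for i in range(n)]


# Hypothetical, slightly asymmetric Hessian estimate.
hessian = [[2.0, 1.0], [1.5, 3.0]]
sym_hessian = symmetrize(hessian)
```

The diagonal is unchanged and each off-diagonal pair is replaced by its mean, so the result is symmetric by construction.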
0.43.0 - 2024-02-26
- Renamed function field of Substra Function pydantic model to archive (#181)
- Update schemas and tests to remove Pydantic v2 warnings (#183)
0.42.0 - 2023-10-18
- Support on Python 3.11 (#169)
- Remove substrafl wheel cache (#175)
- Camelyon benchmark download files (#182)
0.41.1 - 2023-10-06
- Fix Newton-Raphson docstring (#170)
0.41.0 - 2023-09-08
- Update to pydantic 2.3.0 (#159)
0.40.0 - 2023-09-07
- Check the Python version used before generating the Dockerfile (#155).
- Python dependencies can be resolved using pip compile during function registration by setting compile to True in the Dependency object (#155).
      Dependency(
          pypi_dependencies=["pytest", "numpy"],
          compile=True,
      )
- Dependency objects are now computed at initialization in a cache directory, accessible through the cache_directory attribute. The cache directory is deleted at the Dependency object deletion. (#155)
- Check created wheels name. (#160)
- BREAKING: Rename generate_wheel.py to manage_dependencies.py (#156)
- BREAKING: Move manage_dependencies.py from remote.register to dependency (#158)
- BREAKING: local_dependencies is renamed local_installable_dependencies (#158)
- BREAKING: local_installable_dependencies are now limited to local modules or Python wheels (no support for bdist, sdist...) (#155).
- Set, save & load random.seed and np.random.seed along with torch.manual_seed in TorchAlgo (#151)
- Keep the last round task output by default (#162)
0.39.0 - 2023-07-25
- BREAKING: Input and output of aggregate tasks are now shared_state. It provides more flexibility to link different types of tasks with each other. To use download_aggregate_shared_state on experiments launched before this commit, you can use the following code as a replacement of the function (#142).
      import tempfile

      from substrafl.model_loading import _download_task_output_files
      from substrafl.model_loading import _load_from_files

      with tempfile.TemporaryDirectory() as temp_folder:
          _download_task_output_files(
              client=<client>,
              compute_plan_key=<compute_plan_key>,
              dest_folder=temp_folder,
              round_idx=<round_idx>,
              rank_idx=<rank_idx>,
              task_type="aggregate",
              identifier="model",
          )
          aggregated_state = _load_from_files(input_folder=temp_folder, remote=True)
- Removed: function wait in utils. You can use substra.Client.wait_task & substra.Client.wait_compute_plan instead. (#147)
- Compatibility with GPU devices when running torch based experiments (#154)
- Pin pydantic to >=1.9.0 & <2.0.0 as pydantic v2.0.0 has been released with a lot of non backward compatible changes. (#148)
0.38.0 - 2023-06-27
- BREAKING: Rename model_loading.download_shared_state to model_loading.download_train_shared_state (#143)
- BREAKING: Rename model_loading.download_aggregated_state to model_loading.download_aggregate_shared_state (#143)
- Numpy < 1.24 in dependencies to keep pickle compatibility with substra-tools numpy version (#144)
0.37.0 - 2023-06-12
- ComputePlanBuilder base class to define which methods are needed to implement a custom strategy in SubstraFL. These methods are build_compute_plan, load_local_states and save_local_states. #120
- Check and test on string used as metric name in test data nodes (#122).
- Add default exclusion patterns when copying files to avoid creating large Docker images (#118)
- Add the possibility to force the Dependency editable_mode through the environment variable SUBSTRA_FORCE_EDITABLE_MODE (#131)
- BREAKING: deprecate the usage of model_loading.download_algo_files and model_loading.load_algo functions. New utils functions are now available. (#125)
  - model_loading.download_algo_state to download a SubstraFL algo of a given round or rank.
  - model_loading.download_shared_state to download a SubstraFL shared object of a given round or rank.
  - model_loading.download_aggregated_state to download a SubstraFL aggregated of a given round or rank.
  The API change goes from:
      algo_files_folder = str(pathlib.Path.cwd() / "tmp" / "algo_files")

      download_algo_files(
          client=client_to_download_from,
          compute_plan_key=compute_plan.key,
          round_idx=round_idx,
          dest_folder=algo_files_folder,
      )

      model = load_algo(input_folder=algo_files_folder).model
  to
      algo = download_algo_state(
          client=client_to_download_from,
          compute_plan_key=compute_plan.key,
          round_idx=round_idx,
      )

      model = algo.model
- BREAKING: rename build_graph to build_compute_plan. (#120)
- BREAKING: move schema.py into the strategy module. (#120)
      from substrafl.schemas import FedAvgSharedState
      # becomes
      from substrafl.strategies.schemas import FedAvgSharedState
- Way to copy function files (#118)
- download_train_task_models_by_rank uses new function list_task_output_assets instead of using value that has been removed (#129)
- New dependencies copy method in Docker mode. (#130)
0.36.0 - 2023-05-11
- Close issue #114. Large batch sizes are set to the number of samples in predict for NR and FedPCA. (#115)
- BREAKING: Metrics are now given as metric_functions and not as metric_key. The functions given as metric functions to test data nodes are automatically registered in a new Substra function by SubstraFL. (#117). The new argument of the TestDataNode class metric_functions replaces the metric_keys one and accepts a dictionary (using the key as the identifier of the function given as value), a list of functions or directly a function if there is only one metric to compute (function.__name__ is then used as identifier). Installed dependencies are the algo_dependencies passed to execute_experiment, and permissions are the same as the predict function.
  From a user point of view, the metric registration changes from:
      def accuracy(datasamples, predictions_path):
          y_true = datasamples["labels"]
          y_pred = np.load(predictions_path)
          return accuracy_score(y_true, np.argmax(y_pred, axis=1))

      metric_deps = Dependency(pypi_dependencies=["numpy==1.23.1", "scikit-learn==1.1.1"])
      permissions_metric = Permissions(public=False, authorized_ids=DATA_PROVIDER_ORGS_ID)

      metric_key = add_metric(
          client=client,
          metric_function=accuracy,
          permissions=permissions_metric,
          dependencies=metric_deps,
      )

      test_data_nodes = [
          TestDataNode(
              organization_id=org_id,
              data_manager_key=dataset_keys[org_id],
              test_data_sample_keys=[test_datasample_keys[org_id]],
              metric_keys=[metric_key],
          )
          for org_id in DATA_PROVIDER_ORGS_ID
      ]
  to:
      def accuracy(datasamples, predictions_path):
          y_true = datasamples["labels"]
          y_pred = np.load(predictions_path)
          return accuracy_score(y_true, np.argmax(y_pred, axis=1))

      test_data_nodes = [
          TestDataNode(
              organization_id=org_id,
              data_manager_key=dataset_keys[org_id],
              test_data_sample_keys=[test_datasample_keys[org_id]],
              metric_functions={"Accuracy": accuracy},
          )
          for org_id in DATA_PROVIDER_ORGS_ID
      ]
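The three accepted shapes of metric_functions (dictionary, list of functions, single function) can be pictured with the sketch below. It is not SubstraFL's implementation: normalize_metric_functions is a hypothetical helper, and using __name__ as the identifier for list items is an assumption (the entry only states it for the single-function case).

```python
from typing import Callable, Dict, List, Union

MetricInput = Union[Callable, List[Callable], Dict[str, Callable]]


def normalize_metric_functions(metric_functions: MetricInput) -> Dict[str, Callable]:
    """Turn any accepted shape into an {identifier: function} dictionary."""
    if isinstance(metric_functions, dict):
        # Keys are already the identifiers.
        return dict(metric_functions)
    if callable(metric_functions):
        # A single function: its __name__ is the identifier.
        return {metric_functions.__name__: metric_functions}
    # A list of functions (assumption: each __name__ is the identifier).
    return {fn.__name__: fn for fn in metric_functions}


def accuracy(datasamples, predictions_path):
    """Placeholder metric used only to exercise the helper."""
    return 1.0


as_dict = normalize_metric_functions({"Accuracy": accuracy})
as_single = normalize_metric_functions(accuracy)
as_list = normalize_metric_functions([accuracy])
```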
- Enforce kwargs for user facing function with more than 3 parameters (#109)
- Remove references to composite. Replace by train_task. (#108)
- Add the Federated Principal Component Analysis strategy (#97)
0.35.1 - 2023-04-11
- Change order of layers in the Dockerfile: files are copied as needed before the installation layers, and the final copy is made last. (#110)
0.35.0 - 2023-03-31
- Initialization task to each strategy in SubstraFL. (#89)
This allows loading the Algo and all its attributes to the platform before any training. Once on the platform, we can perform a testing task before any training.
This init task consists in submitting an empty function, coded in the BaseAlgo class.
    @remote
    def initialize(self, shared_states):
        return
The init task returns a local output that will be passed as input to a test task, and to the first train task.
The graph changes from:
flowchart LR
TrainTask1_round0--Local-->TestTask1_r0
TrainTask1_round0--Shared-->TestTask1_r0
TrainTask2_round0--Shared-->AggregateTask
TrainTask2_round0--Local-->TestTask2_r0
TrainTask2_round0--Shared-->TestTask2_r0
AggregateTask--Shared-->TrainTask1_r1
TrainTask1_round0--Local-->TrainTask1_r1
AggregateTask--Shared-->TrainTask2_r1
TrainTask2_round0--Local-->TrainTask2_r1
TrainTask1_round0--Shared-->AggregateTask
TrainTask1_r1--Local-->TestTask1_r1
TrainTask1_r1--Shared-->TestTask1_r1
TrainTask2_r1--Local-->TestTask2_r1
TrainTask2_r1--Shared-->TestTask2_r1
to:
flowchart LR
InitTask1_round0--Local-->TestTask1_r0
InitTask2_round0--Local-->TestTask2_r0
InitTask1_round0--Local-->TrainTask1_r1
InitTask2_round0--Local-->TrainTask2_r1
TrainTask2_r1--Shared-->AggregateTask
TrainTask1_r1--Shared-->AggregateTask
TrainTask1_r1--Local-->TestTask1_r1
TrainTask2_r1--Local-->TestTask2_r1
TrainTask1_r1--Local-->TrainTask1_r2
TrainTask2_r1--Local-->TrainTask2_r2
AggregateTask--Shared-->TrainTask1_r2
AggregateTask--Shared-->TrainTask2_r2
TrainTask1_r2--Local-->TestTask1_r2
TrainTask2_r2--Local-->TestTask2_r2
- BREAKING: algo are now passed as parameter to the strategy and not to execute_experiment anymore (#98)
- BREAKING: A strategy needs to implement a new method build_graph to build the graph of tasks to be executed in execute_experiment (#98)
- BREAKING: predict method of strategy has been renamed to perform_predict (#98)
- Test tasks don't take a shared as input anymore (#89)
- BREAKING: change eval_frequency default value to None to avoid confusion with hidden default value (#91)
- BREAKING: rename Algo to Function (#82)
- BREAKING: clarify EvaluationStrategy arguments: change rounds to eval_frequency and eval_rounds (#85)
- replace schemas.xxx by substra.schemas.xxx (#105)
- BREAKING: Given local code dependencies are now copied to the level of the running script systematically (#99)
- Docker images are pruned in main check of Github Action to free disk space while test run (#102)
- Pass aggregation_lr to the parent class for Scaffold. Fix issue 103 (#104)
- from substra import schemas in aggregation_node.py, test_data_node.py and train_data_node.py (#105)
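To picture how the eval_frequency and eval_rounds arguments of EvaluationStrategy might combine, here is a minimal sketch. It rests on assumptions that this changelog does not confirm: that both arguments are merged as a union, that eval_frequency selects every multiple starting at round 0, and that nothing is evaluated when both are None. rounds_to_evaluate is a hypothetical helper, not a SubstraFL function.

```python
from typing import Iterable, List, Optional


def rounds_to_evaluate(
    num_rounds: int,
    eval_frequency: Optional[int] = None,
    eval_rounds: Optional[Iterable[int]] = None,
) -> List[int]:
    """Return the sorted round indexes selected for evaluation (assumed semantics)."""
    selected = set(eval_rounds or [])
    if eval_frequency is not None:
        # Assumption: every multiple of eval_frequency, round 0 included.
        selected.update(range(0, num_rounds + 1, eval_frequency))
    # Keep only rounds that exist in the experiment.
    return sorted(r for r in selected if 0 <= r <= num_rounds)


every_two = rounds_to_evaluate(6, eval_frequency=2)
explicit = rounds_to_evaluate(6, eval_rounds=[1, 5])
combined = rounds_to_evaluate(6, eval_frequency=3, eval_rounds=[1])
```

With eval_frequency defaulting to None, no round is selected unless the caller asks for it, which is the kind of hidden default the entry above removes.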
0.34.0 - 2023-02-20
- Possibility to test on an organization where no training has been performed (#74)
- Add contributing, contributors & code of conduct files (#68)
- Test only field for datasamples (#67)
- Remove RemoteDataMethod and change RemoteMethod class to be fully flexible regarding function name. The substra-tools method is now generic, and loads the inputs depending on the inputs dictionary content (#59)
- BREAKING: rename tuple to task (#79)
0.33.0 - 2022-12-19
- test: add Github Action to run subprocess tests on Windows after each merge (#60)
- test: pass the CI e2e tests on Python 3.10 (#56)
- fix: bug introduced with numpy 1.24 and cloudpickle: TypeError: __generator_ctor(). Remove version from requirements. (Issue open)
0.32.0 - 2022-11-22
- The metric registration is simplified. The user can now directly write a score function within their script, and directly register it by specifying the right dependencies and permissions. The score function must have (datasamples, predictions_path) as signature. (#47)
  Example of new metric registration:
      metric_deps = Dependency(pypi_dependencies=["numpy==1.23.1"])
      permissions_metric = Permissions(public=True)

      def mse(datasamples, predictions_path):
          y_true = datasamples["target"]
          y_pred = np.load(predictions_path)
          return np.mean((y_true - y_pred)**2)

      metric_key = add_metric(
          client=substra_client,
          permissions=permissions_metric,
          dependencies=metric_deps,
          metric_function=mse,
      )
- doc on the model loading page (#40)
- The round 0 is now exposed. Possibility to evaluate centralized strategies before any training (FedAvg, NR, Scaffold). The round 0 is skipped for single org strategy and cannot be evaluated before training (#46)
- Github actions on Ubuntu 22.04 (#52)
- torch algo: test that with_batch_norm_parameters is only about the running mean and variance of the batch norm layers (#30)
- torch algo: with_batch_norm_parameters - also take into account the torch.nn.LazyBatchNorm{x}d layers (#30)
- chore: use the generic task (#31)
- Apply changes from algo to function in substratools (#34)
  - add tools_functions method to RemoteDataMethod and RemoteMethod to return the function(s) to send to tools.execute.
- Register functions in substratools using decorator @tools.register (#37)
- Update substratools Docker image (#49)
- Fix python 3.10 compatibility by catching OSError for Notebooks (#51)
- Free disk space in main github action to run the CI (#48)
- local dependencies are installed in one pip command to optimize the installation and avoid incompatibilities error (#39)
- Fix error when installing current package as local dependency (#41)
- Fix flake8 repo for pre-commit (#50)
0.31.0 - 2022-10-03
- algo category from algo as it is not required by substra anymore
- documentation of the predict function of Algos was not up to date (#33)
0.30.0 - 2022-09-26
- Return statement of both predict and _local_predict methods from Torch Algorithms.
- Update the Client, it takes a backend type instead of debug=True + env variable to set the spawner - (#210)
- Do not use Model.category since this field is being removed from the SDK
- Update the tests and benchmark with the change on Metrics from substratools (#24)
- NOTABLE CHANGES due to breaking changes in substra-tools:
  - the opener only exposes get_data and fake_data methods
  - the result of the above methods is passed under the datasamples key within the inputs dict arg of all tools methods (train, predict, aggregate, score)
  - all methods (train, predict, aggregate, score) now take a task_properties argument (dict) in addition to inputs and outputs
  - The rank of a task previously passed under the rank key within the inputs is now given in the task_properties dict under the rank key
This means that all opener.py files should be changed from:
    import substratools as tools

    class TestOpener(tools.Opener):
        def get_X(self, folders):
            ...

        def get_y(self, folders):
            ...

        def fake_X(self, n_samples=None):
            ...

        def fake_y(self, n_samples=None):
            ...
to:
    import substratools as tools

    class TestOpener(tools.Opener):
        def get_data(self, folders):
            ...

        def fake_data(self, n_samples=None):
            ...
This also implies that metrics now have access to the results of get_data and not only get_y as previously. Users should adapt all of their metrics files accordingly, e.g.:
    class AUC(tools.Metrics):
        def score(self, inputs, outputs):
            """AUC"""
            y_true = inputs["y"]
            ...

        def get_predictions(self, path):
            return np.load(path)

    if __name__ == "__main__":
        tools.metrics.execute(AUC())
could be replaced with:
    class AUC(tools.Metrics):
        def score(self, inputs, outputs, task_properties):
            """AUC"""
            datasamples = inputs["datasamples"]
            y_true = ...  # getting target from the whole datasamples

        def get_predictions(self, path):
            return np.load(path)

    if __name__ == "__main__":
        tools.metrics.execute(AUC())
- BREAKING CHANGE: train and predict methods of all substrafl algos now take datasamples as argument instead of X and y. This impacts user code only if it overwrites those methods instead of using the _local_train and _local_predict methods.
- BREAKING CHANGE: The result of the get_data method from the opener is automatically provided to the given dataset as __init__ arg instead of x and y within the train and predict methods of all Torch*Algo classes. The user dataset should be adapted accordingly e.g.:
    from torch.utils.data import Dataset

    class MyDataset(Dataset):
        def __init__(self, x, y, is_inference=False) -> None:
            ...

    class MyAlgo(TorchFedAvgAlgo):
        def __init__(
            self,
        ):
            torch.manual_seed(seed)
            super().__init__(
                model=my_model,
                criterion=criterion,
                optimizer=optimizer,
                index_generator=index_generator,
                dataset=MyDataset,
            )
should be replaced with
    from torch.utils.data import Dataset

    class MyDataset(Dataset):
        def __init__(self, datasamples, is_inference=False) -> None:
            ...

    class MyAlgo(TorchFedAvgAlgo):
        def __init__(
            self,
        ):
            torch.manual_seed(seed)
            super().__init__(
                model=my_model,
                criterion=criterion,
                optimizer=optimizer,
                index_generator=index_generator,
                dataset=MyDataset,
            )
0.29.0 - 2022-09-19
- Use the new Substra SDK feature that enables setting the transient flag on tasks instead of clean_models on compute plans to remove intermediary models.
0.28.0 - 2022-09-12
- Throw an error if pytorch 1.12.0 is used. There is a regression bug in torch 1.12.0, that impacts optimizers that have been pickled and unpickled. This bug occurs for Adam optimizer for example (but not for SGD). Here is a link to one issue covering it: pytorch/pytorch#80345
- Removing classic-algos from the benchmark dependencies
- NOTABLE CHANGES due to breaking changes in substra-tools: the user must now pass the method name to execute from the tools defined class within the dockerfile of both algo and metric under the --method-name argument:
      ENTRYPOINT ["python3", "metrics.py"]
  shall be replaced by:
      ENTRYPOINT ["python3", "metrics.py", "--method-name", "score"]
- Use the new Substra sdk features that return the path of the downloaded file. Change the model_loading.py implementation and the tests.
- In the PyTorch algorithms, move the data to the device (GPU or CPU) in the training loop and predict function so that the user does not need to do it.
- Rename connect-tools docker images to substra-tools
- Benchmark:
  - use public data hosted on Zenodo for the benchmark
- Fix the GPU test to the last breaking changes, and unskip the use_gpu=False case
- Update the NpIndexGenerator docstrings to add information on how to use it as a full epoch index generator.
- BREAKING CHANGES:
  - an extra argument predictions_path has been added to both predict and _local_predict methods from all Torch*Algo classes. The user now has to use the _save_predictions method to save their predictions in _local_predict. The user defined metrics will load those saved predictions with np.load(inputs['predictions']). The _save_predictions method can be overwritten.
Default _local_predict method from substrafl algorithms went from:
    def _local_predict(self, predict_dataset: torch.utils.data.Dataset):
        if self._index_generator is not None:
            predict_loader = torch.utils.data.DataLoader(predict_dataset, batch_size=self._index_generator.batch_size)
        else:
            raise BatchSizeNotFoundError(
                "No default batch size has been found to perform local prediction. "
                "Please overwrite the _local_predict function of your algorithm."
            )

        self._model.eval()

        predictions = torch.Tensor([])
        with torch.inference_mode():
            for x in predict_loader:
                predictions = torch.cat((predictions, self._model(x)), 0)

        return predictions
to
    def _local_predict(self, predict_dataset: torch.utils.data.Dataset, predictions_path: Path):
        if self._index_generator is not None:
            predict_loader = torch.utils.data.DataLoader(predict_dataset, batch_size=self._index_generator.batch_size)
        else:
            raise BatchSizeNotFoundError(
                "No default batch size has been found to perform local prediction. "
                "Please overwrite the _local_predict function of your algorithm."
            )

        self._model.eval()

        predictions = torch.Tensor([])
        with torch.inference_mode():
            for x in predict_loader:
                predictions = torch.cat((predictions, self._model(x)), 0)

        self._save_predictions(predictions, predictions_path)

        return predictions
- NOTABLE CHANGES due to breaking changes in connect-tools.
  - both load_predictions and get_predictions methods have been removed from the opener
  - the user defined metrics now take inputs and outputs as arguments.
    - inputs is a dict containing:
      - rank: int
      - y: the result of get_y applied to the task datasamples
      - predictions: a file path where the output predictions of the user defined algo have been saved. As stated above, those predictions can be loaded thanks to np.load if the user didn't overwrite the _save_predictions method from substrafl defined *Algo.
    - outputs is a dict containing:
      - performance: a file path where to save the result of the metrics. It must be done through the tools.save_performance function.
Instead of:
    import substratools as tools
    from sklearn.metrics import roc_auc_score

    class AUC(tools.MetricAlgo):
        def score(self, y_true, y_pred):
            """AUC"""
            metric = roc_auc_score(y_true, y_pred) if len(set(y_true)) > 1 else 0
            return float(metric)

    if __name__ == "__main__":
        tools.algo.execute(AUC())
the metric files should look like:
    import numpy as np
    import substratools as tools
    from sklearn.metrics import roc_auc_score

    class AUC(tools.MetricAlgo):
        def score(self, inputs, outputs):
            """AUC"""
            y_pred = np.load(inputs["predictions"])
            y_true = inputs["y"]
            metric = roc_auc_score(y_true, y_pred) if len(set(y_true)) > 1 else 0
            tools.save_performance(float(metric), outputs["performance"])

    if __name__ == "__main__":
        tools.algo.execute(AUC())
- Documentation for the _skip argument from the _local_predict and _local_train methods of Torch*Algo.
- Update the inputs/outputs to make them compatible with the task execution
- GPU execution: move the RNG state to CPU in case the checkpoint has been loaded on the GPU
- fix: rng state for torch algos. Add test for both stability between organizations and rounds.
- feat: _local_predict has been re added
- feat: add default batching to predict
- BREAKING CHANGE: drop Python 3.7 support
- BREAKING CHANGE: the library is now named "substrafl"
- feat: add compute task inputs
- fix: support several items in the Dependency - local_dependencies field
- feat: add compute task output
- BREAKING CHANGE: add the torch Dataset as argument of TorchAlgo to preprocess the data
  - The __init__ function of the dataset must contain (self, x, y, is_inference). The __getitem__ function is expected to return x, y if is_inference is False, else x. This behavior can be changed by re-writing the _local_train or predict methods.
  - _local_train is no longer mandatory to overwrite. Its signature changed from (x, y) to (train_dataset)
  - _local_predict has been deleted.
  - _get_len_from_x has been deleted.
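The dataset contract described above, an __init__ taking (self, x, y, is_inference) and a __getitem__ returning x, y or only x, can be sketched without PyTorch as follows. MyDataset is a plain-Python stand-in written for illustration; in real code it would subclass torch.utils.data.Dataset.

```python
class MyDataset:
    """Plain-Python stand-in for a torch.utils.data.Dataset subclass."""

    def __init__(self, x, y, is_inference=False):
        self.x = x
        self.y = y
        self.is_inference = is_inference

    def __len__(self):
        return len(self.x)

    def __getitem__(self, idx):
        if self.is_inference:
            # Inference: only the features are returned.
            return self.x[idx]
        # Training: features and target are returned together.
        return self.x[idx], self.y[idx]


train_dataset = MyDataset(x=[1.0, 2.0], y=[10.0, 20.0])
predict_dataset = MyDataset(x=[1.0, 2.0], y=None, is_inference=True)
```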
- feat: the compute plan tasks are uploaded to Connect using the auto-batching feature (it should solve gRPC message errors for large compute plans)
- BREAKING CHANGE: convert (test task) to (predict task + test task)
- Added functions to download the model of a strategy:
  - The function substrafl.model_loading.download_algo_files downloads the files needed to load the output model of a strategy according to the given round. These files are downloaded to the given folder.
  - The substrafl.model_loading.load_algo function to load the output model of a strategy from the files previously downloaded via the function substrafl.model_loading.download_algo_files.
  Those two functions work together:
      download_algo_files(client=substra_client, compute_plan_key=key, round_idx=None, dest_folder=session_dir)
      model = load_algo(input_folder=session_dir)
- compatibility with substra 0.28.0
- feat: Newton Raphson strategy
- added packaging to the install requirements
- Stop using metrics APIs, use algo APIs instead
- BREAKING CHANGE: Strategy rounds start at 1 and the initialization round is now 0. They used to start at 0 and the initialization round was -1. For each composite train tuple, aggregate tuple and test tuple, the metadata round_idx has changed according to the rule stated above.
- BREAKING CHANGE: rename node to organization in Connect
- Rename the OneNode strategy to SingleOrganization
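The round renumbering above amounts to shifting every round_idx by one. The sketch below shows this for code that still holds indexes from the old scheme; migrate_round_idx is a hypothetical helper, not part of the library.

```python
def migrate_round_idx(old_round_idx: int) -> int:
    """Map an old round index (init = -1, first round = 0) to the new scheme
    (init = 0, first round = 1)."""
    if old_round_idx < -1:
        raise ValueError("old round indexes start at -1 (initialization round)")
    return old_round_idx + 1


new_init_round = migrate_round_idx(-1)
new_first_round = migrate_round_idx(0)
```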
- when using the TorchScaffoldAlgo:
  - The number of times the _scaffold_parameters_update method must be called within the _local_train method is now checked
  - A warning is thrown if an optimizer other than SGD is used
  - If multiple learning rates are set for the optimizer, a warning is thrown and the smallest learning rate is used for the shared state aggregation operation. 0 is not considered as a learning rate for this choice as it could be used to deactivate the learning process of certain layers from the model.
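The learning-rate rule for TorchScaffoldAlgo described above (smallest learning rate, ignoring 0) can be sketched as below. pick_aggregation_lr is a hypothetical helper written for illustration; the real class also emits a warning, which is omitted here.

```python
from typing import Iterable


def pick_aggregation_lr(learning_rates: Iterable[float]) -> float:
    """Return the smallest non-zero learning rate.

    A rate of 0 is skipped because it may only be there to freeze some layers.
    """
    non_zero = [lr for lr in learning_rates if lr != 0]
    if not non_zero:
        raise ValueError("at least one non-zero learning rate is required")
    return min(non_zero)


# Hypothetical optimizer with three parameter groups, one of them frozen.
chosen_lr = pick_aggregation_lr([0.0, 0.1, 0.01])
```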
- BREAKING CHANGE: add initialization round to centralized strategies:
  - Each centralized strategy starts with an initialization round composed of one composite train tuple on each train data node
  - One round of a centralized strategy is now: Aggregation -> Training on composite
  - Composite train tuples before test tuples have been removed
- All torch algorithms now have a common predict method
- The algo argument has been removed from the predict method of all strategies
- The fake_traintuple attribute of the RemoteStruct class has been removed
The full discussion regarding this feature can be found here
- feat: meaningful name for algo. You can use the _algo_name parameter to set a custom algo name for the registration. By default, it is set to method-name_class-name.
      algo.train(
          node.data_sample_keys,
          shared_state=self.avg_shared_state,
          _algo_name=f"Training with {algo.__class__.__name__}",
      )
- chore: add latest connect-tools docker image selection
- Torch algorithms now support GPUs, there is a parameter use_gpu in the __init__ of the Torch algo classes. If use_gpu is True and there is no GPU detected, the code runs on CPU.
- The wheels of the libraries installed with editable=True are now in $HOME/.substrafl instead of $LIB_PATH/dist
- benchmark:
  - make benchmark runs the default remote benchmark on the connect platform specified in the config file
  - make benchmark-local runs the default local benchmark in subprocess mode
- BREAKING CHANGE: replace "tag" argument with "name" in execute_experiment
- execute_experiment checks that the algo and strategy are compatible. You can override the list of strategies the algo is compatible with using the strategies property:
      from substrafl.algorithms.algo import Algo
      from substrafl import StrategyName

      class MyAlgo(Algo):
          @property
          def strategies(self):
              return [StrategyName.FEDERATED_AVERAGING, StrategyName.SCAFFOLD]
          # ...
- feat: the compute plan key of the experiment is saved in the experiment summary before submitting or executing it
- feat: add the possibility for the user to pass additional metadata to the compute plan metadata
- Force the reinstallation of connect-tools in the Docker image, necessary for the editable mode
- BREAKING CHANGE: the default value of drop_last in the NpIndexGenerator is now False
- BREAKING CHANGE: the index generator is now required when implementing a strategy
      from substrafl.index_generator import NpIndexGenerator

      nig = NpIndexGenerator(
          batch_size=batch_size,
          num_updates=num_updates,
          drop_last=False,  # optional, defaults to False
          shuffle=True,  # optional, defaults to True
      )

      class MyAlgo(TorchFedAvgAlgo):
          def __init__(self):
              super().__init__(
                  index_generator=nig,
                  # other parameters
              )
          # ...
- The user can now initialize their TorchAlgo function with custom parameters (only primitive types are supported):
      class MyAlgo(TorchFedAvgAlgo):
          def __init__(self, my_arg):
              super().__init__(
                  model=model,
                  criterion=criterion,
                  optimizer=optimizer,
                  index_generator=nig,
                  my_arg=my_arg,  # This is necessary
              )
          # ...
- Fix the format of the asset ids: the right format is str(uuid.uuid4()) and not uuid.uuid4().hex
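The difference between the two asset id formats mentioned above is easy to see with the standard library: str(uuid.uuid4()) gives the 36-character hyphenated form, while .hex gives 32 hexadecimal characters without hyphens. new_asset_id is a hypothetical helper name used only for this sketch.

```python
import uuid


def new_asset_id() -> str:
    """Return an id in the hyphenated form produced by str(uuid.uuid4())."""
    return str(uuid.uuid4())


asset_id = new_asset_id()

# The same identifier in the format that was previously (incorrectly) used.
hex_form = uuid.UUID(asset_id).hex
```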
- feat: rename "compute_plan_tag" to "tag" #131
- feat: Add the optional argument "compute_plan_tag" to give the user the possibility to choose its own tag (timestamp by default) #128
- feat: Scaffold strategy
- feat: add one node strategy
- The Connect tasks have a round_idx attribute in their metadata
- doc: add python api to documentation
- API documentation: fix the docstrings and the display of the documentation for some functions
- (BREAKING CHANGE) FedAvg strategy: the train function must return a FedAvgSharedState, the average function returns a FedAvgAveragedState. No need to change your code if you use TorchFedAvgAlgo
- benchmark:
  - Use the same batch sampler between the torch and Substrafl examples
  - Make it work with num_workers > 0
  - Explain the effect of the sub-sampling
  - Update the default benchmark parameters in benchmarks.sh
  - Add new curves to the plotting: when one parameter changes while the others stay the same
- Use connect-tools 0.10.0 as a base image for the Dockerfile
- fix: naming changed from FedAVG to FedAvg
- fix: log a warning if an existing wheel is used to build the docker image
- fix: execute_experiment has no side effects on its arguments
- fix: Dependency.local_package are installed in no editable mode and additionally accepts pyproject.yaml as configuration file
- fix: execute_experiment accepts None as evaluation_strategy
- fix: The substrafl.algorithms.algo.Algo abstractmethod decorator is now taken into account
- feat: EvaluationStrategy can now be reinitialized
- Refactoring substrafl.algorithms.pytorch.fed_avg.TorchFedAvgAlgo:
  - replace the _preprocess and _postprocess functions by _local_train and _local_predict
  - the user can override the _get_len_from_x function to get the number of samples in the dataset from x
  - batch_size is now a required argument, and a warning is issued if it is None
- The substrafl.index_generator.np_index_generator.NpIndexGenerator class now works with torch.utils.data.DataLoader, with num_workers > 0
- The benchmark uses substrafl.algorithms.pytorch.fed_avg.TorchFedAvgAlgo instead of its own custom algorithm
- Add the clean_models option to the execute_experiment function
- feat: make a base class for the index generator and document it
- The Algo now exposes a model property to get the model after downloading it from Connect
- (BREAKING CHANGE) experiment summary is saved as a json in experiment_folder
- fix: notebook dependency failure. You can now run a substrafl experiment with local dependencies in a Jupyter notebook
- feat: models can now be tested every n rounds, on the same nodes they were trained on. This feature introduces a new parameter evaluation_strategy in execute_experiment, which takes an EvaluationStrategy instance from substrafl.evaluation_strategy. If this parameter is not given, performance will not be measured at all (previously, it was measured at the end of the experiment by default).
- feat: install substrafl from pypi
- fix: Update pydantic version to enable autocompletion
- feat: Add a FL algorithm wrapper in PyTorch for the federated averaging strategy
- test: connect-test integration
- feat: Add a possibility to test an algorithm on selected rounds or every n rounds
- fix: dependency management: the local_code dependencies are copied to the same folder structure relatively to the algo
- fix: dependency management - it failed when resolving the local_code dependencies because the path to the algo was relative
- feat: batch indexer
- feat: more logs + function to set the logging level
- Subprocess mode is now faster as it fully reuses the user environment instead of re building the connect related parts (substra #119 and #63)
- fix: error message for local dependency
- feat: User custom dependencies
- feat: support substra subprocess mode
- first release