Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New default recorder #1161

Merged
merged 3 commits into from Jan 22, 2024
Merged

New default recorder #1161

merged 3 commits into from Jan 22, 2024

Conversation

laggui
Copy link
Member

@laggui laggui commented Jan 22, 2024

Checklist

  • Confirmed that run-checks all script has been executed.
  • Made sure the book is up to date with changes in this PR.

Related Issues

Closes #1151

Changes

After a bit of exploration and tests it was decided that we should revert to the NamedMpkFileRecorder (no compression) as the default. Any type of compression adds to much overhead and detracts the user experience.

  • Changed the DefaultFileRecorder
  • Updated the Burn book

tensors. The record system in Burn gives you the possibility to serialize any type, which is very
useful for optimizers that save their state, but also for any non-standard, cutting-edge modeling
needs you may have. Additionally, the record system performs automatic precision conversion by using
Rust types, making it more reliable with fewer manual manipulations.

It is important to note that the `safetensors` format uses the word *safe* to distinguish itself from Pickle, which is vulnerable to Python code injection. The safety comes from a checksum mechanism that guarantees that the data suffered no corruption. On our end, the simple fact that we use Rust already ensures that no code injection is possible. To prevent any data corruption, using a recorder with Gzip compression is recommended, as it includes a checksum mechanism.
It is important to note that the `safetensors` format uses the word *safe* to distinguish itself from Pickle, which is vulnerable to Python code injection. On our end, the simple fact that we use Rust already ensures that no code injection is possible. To prevent any data corruption, using a recorder with Gzip compression is recommended, as it includes a checksum mechanism.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if we should include this anymore: "To prevent any data corruption, using a recorder with Gzip compression is recommended, as it includes a checksum mechanism." Maybe updating it to: "If your storage mechanism doesn't handle data corruption, you may look into recorders that include a checksum, such as Gzip."

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahhh good point! Will fix.

@nathanielsimard nathanielsimard merged commit 191877f into main Jan 22, 2024
13 checks passed
@nathanielsimard nathanielsimard deleted the recorder/new-default branch January 22, 2024 17:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve the record type performance
2 participants