Harden pickle.load() against deserialization attacks #6

@pixeebot pixeebot bot commented Aug 2, 2024

Python's pickle module is notoriously insecure. It is very useful for serializing and deserializing Python objects, but it is not safe for loading data from untrusted sources, because unpickling can execute arbitrary code. An attacker can exploit this to run code on your system. Unlike yaml, pickle has no concept of a "safe" loader. It is therefore recommended to avoid pickle and to use a different serialization format, such as json or yaml, when working with untrusted data.
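
To make the risk concrete, here is a minimal and deliberately harmless sketch of how a pickle stream can cause a callable to run at load time. The `Demo` class and the choice of `os.getcwd` are illustrative assumptions and are not part of this PR; the same mechanism would let a hostile payload call something destructive.

```python
import os
import pickle


class Demo:
    """Hypothetical class used only to illustrate the mechanism."""

    def __reduce__(self):
        # pickle records "call os.getcwd()" in the stream instead of
        # the object's state. Unpickling performs that call.
        return (os.getcwd, ())


payload = pickle.dumps(Demo())

# Loading the bytes does not rebuild a Demo instance.
# It calls os.getcwd() and returns whatever that call returns.
result = pickle.loads(payload)
print(type(result).__name__, result)
```

The loaded value is a string produced by `os.getcwd`, not a `Demo` object, which shows that the data itself decided which function the loader invoked.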

However, if you must use pickle to load data from an untrusted source, we recommend using the open-source fickling library. fickling is a drop-in replacement for pickle that validates the data before loading it and checks for the possibility of code execution. This makes it much safer (although still not entirely safe) to use pickle to load data from untrusted sources.

This codemod replaces calls to pickle.load with fickling.load in Python code. It also adds an import statement for fickling if it is not already present.

The changes look like the following:

- import pickle
+ import fickling
 
- data = pickle.load(file)
+ data = fickling.load(file)

Dependency Updates

This codemod relies on an external dependency. We have automatically added this dependency to your project's pyproject.toml file.

This package provides analysis of pickled data to help identify potential security vulnerabilities.

There are a number of places where Python project dependencies can be expressed, including setup.py, pyproject.toml, setup.cfg, and requirements.txt files. If this change is incorrect, or if you are using another packaging system such as poetry, it may be necessary for you to manually add the dependency to the proper location in your project.

🧚🤖 Powered by Pixeebot

Feedback | Community | Docs | Codemod ID: pixee:python/harden-pickle-load

@@ -84,6 +84,7 @@ dependencies = [
      "black >= 22.6",
      "isort >= 5.10",
      "parameterized ==0.8.1",
+     "fickling>=0.1.3,~=0.1.0",

License: LGPL-3.0 (open source)
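
The specifier added to pyproject.toml, `fickling>=0.1.3,~=0.1.0`, combines two clauses. Under PEP 440, `~=0.1.0` is a compatible-release clause equivalent to `>=0.1.0, ==0.1.*`, so together with `>=0.1.3` the accepted range is 0.1.3 up to, but not including, 0.2.0. The sketch below illustrates that reading with plain integer tuples. The `parse` and `satisfies` helpers are hypothetical and handle only simple dotted numeric versions; a real tool would use a proper version parser.

```python
def parse(version: str) -> tuple:
    """Split a plain dotted version such as '0.1.3' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version: str) -> bool:
    """Check a version against fickling>=0.1.3,~=0.1.0 (simplified)."""
    v = parse(version)
    # Clause 1: fickling>=0.1.3
    at_least = v >= parse("0.1.3")
    # Clause 2: fickling~=0.1.0, shorthand for >=0.1.0 and ==0.1.*
    compatible = v >= parse("0.1.0") and v[:2] == (0, 1)
    return at_least and compatible


for candidate in ("0.1.2", "0.1.3", "0.1.10", "0.2.0"):
    print(candidate, satisfies(candidate))
```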

Unsafe Deserialization

Description

Unsafe Deserialization (also referred to as Insecure Deserialization) is a vulnerability wherein malformed and untrusted data input is insecurely deserialized by an application. It is exploited to hijack the logic flow of the application and might result in the execution of arbitrary code. Although this isn't exactly a simple attack to employ, it featured in the most recent iteration of OWASP's Top 10 as part of the Software and Data Integrity Failures risk, due to the severity of impact upon compromise.

The process of converting an object state or data structure into a storable or transmissible format is called serialization. Deserialization is its opposite - the process of extracting the serialized data to reconstruct the original object version.

Unsafe Deserialization issues arise when an attacker is able to place crafted malicious data into user-supplied input that is then deserialized. This could result in arbitrary object injection into the application, which might alter its intended behavior.

Impact

A successful Unsafe Deserialization attack can result in the full compromise of the confidentiality, integrity, and availability of the target system, and the oft-cited Equifax breach is probably the best example of the worst outcome that can arise. In Equifax's case, an unsafe Java deserialization attack leveraging the Struts 2 framework resulted in remote code execution, which, in turn, led to one of the largest data breaches in history.

Prevention

It is important to consider any development project from an architectural standpoint to determine when and where serialization is necessary. If it is unnecessary, consider using a simpler format when passing data.

In cases where it is impossible to forego serialization without disrupting the application's operational integrity, developers can implement a range of defence-in-depth measures to mitigate the chances of being exploited.

  • Use serialization that only permits primitive data types.
  • Use a serialization library that provides cryptographic signature and encryption features to ensure serialized data are obtained untainted.
  • Authenticate before deserializing.
  • Use low privilege environments to isolate and run code that deserializes.
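
As an illustration of the first measure in the list above (permitting only primitive data types), the standard library lets an unpickler refuse every class or function reference by overriding `find_class`. This is a minimal sketch; the names `PrimitivesOnlyUnpickler` and `loads_primitives` are hypothetical and are not part of this PR or of fickling.

```python
import io
import pickle


class PrimitivesOnlyUnpickler(pickle.Unpickler):
    """Hypothetical unpickler that refuses every global lookup."""

    def find_class(self, module, name):
        # Every class or function referenced by the stream is resolved
        # through this method, so raising here blocks all of them.
        raise pickle.UnpicklingError(
            f"global '{module}.{name}' is forbidden"
        )


def loads_primitives(data: bytes):
    """Load a pickle that contains only built-in container and scalar types."""
    return PrimitivesOnlyUnpickler(io.BytesIO(data)).load()


safe = pickle.dumps(("mapstr", [1, 2, 3], {"k": 1.5}))
print(loads_primitives(safe))

try:
    # A pickled reference to a function needs a global lookup.
    loads_primitives(pickle.dumps(len))
except pickle.UnpicklingError as err:
    print("rejected:", err)
```

Strings, numbers, lists, tuples, and dicts load normally because they never require a global lookup, while anything that references a class or function is rejected.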

Finally, if possible, replace object serialization with data-only serialization formats, such as JSON.
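
A data-only format might look like the following sketch, which stores the three values this PR loads (`mapstr`, `dist_matrix`, `pathfinding_routes`) as JSON. It assumes the matrices can be represented as nested lists of numbers, which this thread does not confirm; the helper names `dump_baked` and `load_baked` are hypothetical.

```python
import json


def dump_baked(mapstr, dist_matrix, pathfinding_routes) -> str:
    """Serialize the three values as a JSON object (data only, no code)."""
    return json.dumps(
        {
            "mapstr": mapstr,
            "dist_matrix": dist_matrix,
            "pathfinding_routes": pathfinding_routes,
        }
    )


def load_baked(text: str):
    """Parse the JSON text and return the three values as a tuple."""
    data = json.loads(text)
    return data["mapstr"], data["dist_matrix"], data["pathfinding_routes"]


text = dump_baked("#-#", [[0.0, 1.0], [1.0, 0.0]], [[0, 0], [1, 1]])
print(load_baked(text))
```

Parsing JSON can only produce dicts, lists, strings, numbers, booleans, and null, so loading it cannot trigger a function call the way unpickling can.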

Testing

Verify that serialization is not used when communicating with untrusted clients. If this is not possible, ensure that adequate integrity controls (and possibly encryption if sensitive data is sent) are enforced to prevent deserialization attacks including object injection.
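
One possible integrity control is to authenticate the serialized bytes with an HMAC and to verify the tag before deserializing. The sketch below uses only the standard library. The function names and the layout of tag followed by payload are assumptions made for illustration, and the approach only helps if the key is kept secret from whoever can modify the data.

```python
import hashlib
import hmac
import pickle

DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def sign_and_dump(obj, key: bytes) -> bytes:
    """Pickle obj and prepend an HMAC-SHA256 tag over the pickled bytes."""
    payload = pickle.dumps(obj)
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return tag + payload


def verify_and_load(blob: bytes, key: bytes):
    """Check the tag first; unpickle only if it matches."""
    tag, payload = blob[:DIGEST_SIZE], blob[DIGEST_SIZE:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    # compare_digest avoids leaking information through timing.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("signature mismatch; refusing to unpickle")
    return pickle.loads(payload)


key = b"example-key-for-illustration-only"
blob = sign_and_dump(("mapstr", [1, 2], [3, 4]), key)
print(verify_and_load(blob, key))
```

A blob that was modified after signing, or signed with a different key, fails the check before `pickle.loads` is ever reached.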

View this in the SecureFlag Knowledge Base

Micro-Learning Topic: Deserialization attack (Detected by phrase)

Matched on "deserialization attack"

What is this? (2min video)

It is often convenient to serialize objects for communication or to save them for later use. However, serialized data or code can be modified. This malformed data or unexpected data could be used to abuse application logic, deny service, or execute arbitrary code when deserialized. This is usually done with "gadget chains

Try a challenge in Secure Code Warrior

Unable to locate .performanceTestingBot config file

@pep8speaks

Hello @pixeebot[bot]! Thanks for opening this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 615:80: E501 line too long (80 > 79 characters)

cr-gpt bot commented Aug 2, 2024

It seems you are using me but have not set OPENAI_API_KEY in Variables/Secrets for this repo. You can follow the README for more information.

korbit-ai bot commented Aug 2, 2024

You’ve installed Korbit to your GitHub repository but you haven’t created a Korbit account yet!

To create your Korbit account and get your PR scans, please visit here

The files' contents are under analysis for test generation.

semanticdiff-com bot commented Aug 2, 2024

Review changes with SemanticDiff.

Analyzed 1 of 2 files.

Overall, the semantic diff is 9% smaller than the GitHub diff.

Filename Status
pyproject.toml Unsupported file format
✔️ evennia/contrib/grid/xyzgrid/xymap.py 8.61% smaller

restack-app bot commented Aug 2, 2024

No applications have been configured for previews targeting branch: main. To do so go to restack console and configure your applications for previews.

git-greetings bot commented Aug 2, 2024

Thanks @pixeebot[bot] for opening this PR!

For COLLABORATOR only :

  • To add labels, comment on the issue
    /label add label1,label2,label3

  • To remove labels, comment on the issue
    /label remove label1,label2,label3

instapr bot commented Aug 2, 2024

Feedback

  • Good job on addressing the security issue with pickle.load by replacing it with fickling.load.
  • Ensure that the fickling library is documented as a dependency in pyproject.toml for transparency and clarity.
  • Consider revisiting exception handling in case of failures during loading.

Overall, a solid improvement in security practices.

👋 Hi there!

  1. Import statements: Ensure all imports are consistent and follow a standardized format.
  2. Error handling: Review exception handling for potential issues or unhandled cases.
  3. Dependency management: Check that dependencies are added correctly with appropriate versioning constraints in the pyproject.toml file.


Automatically generated with the help of gpt-3.5-turbo.
Feedback? Please don't hesitate to drop me an email at webber@takken.io.

senior-dev-bot bot commented Aug 2, 2024

Hi there! 👋 Thanks for opening a PR. It looks like you've already reached the 5 review limit on our Basic Plan for the week. If you still want a review, feel free to upgrade your subscription in the Web App and then reopen the PR

@labels-and-badges labels-and-badges bot added NO JIRA This PR does not have a Jira Ticket PR:size/XS Denotes a Pull Request that changes 0-9 lines. release This PR is a release labels Aug 2, 2024

difflens bot commented Aug 2, 2024

View changes in DiffLens

Potential issues, bugs, and flaws that can introduce unwanted behavior:

  1. fickling module is imported but not used anywhere in the codebase, which can lead to unused imports and potential confusion for future developers.

Code suggestions and improvements for better exception handling, logic, standardization, and consistency:

  1. xymap.py:

    • Consider removing the unused fickling import if it's not required for the codebase to avoid clutter and confusion.
    • If the intention is to replace the usage of pickle.load with fickling.load, ensure that fickling has a similar implementation to handle loading data from files or adjust the code accordingly.
  2. pyproject.toml:

    • Ensure that the dependencies specified in pyproject.toml are actually required for the project. If fickling is added as a dependency, make sure that it serves a purpose and is not accidentally added without necessity.

git-greetings bot commented Aug 2, 2024

PR Details of @pixeebot[bot] in evennia :

OPEN CLOSED TOTAL
2 4 6

coderabbitai bot commented Aug 2, 2024

Important

Review skipped

Auto reviews are limited to specific labels.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share
Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve to resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Micro-Learning Topic: Insecure deserialization (Detected by phrase)

Matched on "Unsafe Deserialization"

What is this? (2min video)

It is often convenient to serialize objects for communication or to save them for later use. However, serialized data or code can be modified. This malformed data or unexpected data could be used to abuse application logic, deny service, or execute arbitrary code when deserialized. This is usually done with "gadget chains

Try a challenge in Secure Code Warrior

Helpful references

Micro-Learning Topic: Template Object Injection (Detected by phrase)

Matched on "object injection"

What is this? (2min video)

Instantiating a template using a user-controlled object is vulnerable to local file read and potential remote code execution.

Try a challenge in Secure Code Warrior

Helpful references

guide-bot bot commented Aug 2, 2024

Thanks for opening this Pull Request!
We need you to:

  1. Fill out the description.

    Action: Edit description and replace <!-- ... --> with actual values.

Comment on lines 613 to 618
  with open(self.pathfinder_baked_filename, "rb") as fil:
      try:
-         mapstr, dist_matrix, pathfinding_routes = pickle.load(fil)
+         mapstr, dist_matrix, pathfinding_routes = fickling.load(fil)
      except Exception:
          logger.log_trace()
      if (

The code catches a generic Exception when loading data with fickling.load(fil). Catching generic exceptions can make debugging difficult and may hide other issues. It's better to catch specific exceptions that fickling.load might raise, such as fickling.exceptions.UnpicklingError.

Recommended Solution:
Catch specific exceptions related to the fickling.load function to improve error handling and debugging.
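
A sketch of what narrower handling could look like is shown below. It catches the exceptions that the Python documentation lists as possible during unpickling and lets anything else propagate. The function name `load_baked_or_default`, the `loader` parameter, the default tuple, and the use of `print` in place of `logger.log_trace()` are all assumptions made for illustration. Any fickling-specific exception types would need to be added to the tuple; they are not confirmed in this thread.

```python
import io
import pickle

# Exceptions the Python documentation names as possible during unpickling.
EXPECTED_LOAD_ERRORS = (
    pickle.UnpicklingError,
    AttributeError,
    EOFError,
    ImportError,
    IndexError,
)


def load_baked_or_default(fil, loader=pickle.load, default=("", None, None)):
    """Load from fil, returning default only for expected load failures."""
    try:
        return loader(fil)
    except EXPECTED_LOAD_ERRORS as err:
        # Stand-in for logger.log_trace() in the real module.
        print(f"could not load baked data: {err!r}")
        return default


good = io.BytesIO(pickle.dumps(("mapstr", 1, 2)))
print(load_baked_or_default(good))

# An empty file raises EOFError, which is handled and replaced by the default.
print(load_baked_or_default(io.BytesIO(b"")))
```

Passing a different callable as `loader` keeps the error handling in one place regardless of which loading function the project settles on.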

squash-labs bot commented Aug 2, 2024

Manage this branch in Squash

Test this branch here: https://pixeebotdrip-2024-08-02-pixee-2n2gl.squash.io

gooroo-dev bot commented Aug 2, 2024

Please double check the following review of the pull request:

Issues counts

🐞Mistake 🤪Typo 🚨Security 🚀Performance 💪Best Practices 📖Readability ❓Others
0 0 0 0 1 0 0

Changes in the diff

  • ➕ Added fickling library to pyproject.toml dependencies.
  • ➕ Replaced pickle.load() with fickling.load() in xymap.py to harden against deserialization attacks.

Identified Issues

ID | Type | Details | Severity | Confidence
1 | 💪Best Practices | fickling.load() should be used with caution and proper validation. | 🟠Medium | 🟠Medium

Issue 1: 💪Best Practices

Explanation:

The fickling library is used to replace pickle for deserialization. While fickling offers some security improvements, it is still crucial to validate the data being deserialized to ensure it is safe and expected.

Code to Address the Issue:

import fickling

def safe_load(file):
    try:
        data = fickling.load(file)
        # Add validation logic here
        if not isinstance(data, tuple) or len(data) != 3:
            raise ValueError("Invalid data format")
        return data
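    except Exception:
        # logger is assumed to be the module-level logger already used in xymap.py
        logger.log_trace()
        raise  # re-raise the original exception unchanged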

# Usage
with open(self.pathfinder_baked_filename, "rb") as fil:
    try:
        mapstr, dist_matrix, pathfinding_routes = safe_load(fil)
    except Exception:
        logger.log_trace()

Explanation of the Fix:

The safe_load function wraps fickling.load() and includes a basic validation step to ensure the data format is as expected. This helps mitigate potential risks associated with deserialization.

Missing Tests

Test for safe_load function:

import io
import pickle
import unittest

# safe_load is the helper shown above; import it from wherever it is defined.


class TestSafeLoad(unittest.TestCase):
    def test_safe_load_valid_data(self):
        # Build the input bytes with the standard library's pickle.dumps
        data = pickle.dumps(("mapstr", "dist_matrix", "pathfinding_routes"))
        result = safe_load(io.BytesIO(data))
        self.assertEqual(result, ("mapstr", "dist_matrix", "pathfinding_routes"))

    def test_safe_load_invalid_data(self):
        # A bare string is not the expected 3-tuple, so validation should fail
        data = pickle.dumps("invalid_data")
        with self.assertRaises(ValueError):
            safe_load(io.BytesIO(data))

if __name__ == "__main__":
    unittest.main()

This test ensures that the safe_load function correctly handles both valid and invalid data formats.

Summon me to re-review when updated! Yours, Gooroo.dev
Feel free to react or reply with your feedback!

@gstraccini gstraccini bot requested a review from D0LLi August 2, 2024 07:59
@labels-and-badges labels-and-badges bot added the PR:APPROVED Review is approved label Aug 2, 2024

lang-ci bot commented Aug 2, 2024

Issues Summary

1. Operation Cancellation Error

Logs Summary: The operation was canceled during the job execution.

Failing Step:

The operation was canceled.

Related Source Files:

None

Related Failures:

test-suite / Test (3.10, postgresql)

test-suite / Test (3.11, postgresql)


ℹ️ Help (You can turn this bot off by adding a comment /ai off, or force a refresh of this report with /ai ...)

For more support, join our Discord channel

nudge-bot bot commented Aug 5, 2024

Hello @D0LLi. The PR is blocked on your approval. Please review it ASAP.

4 similar comments

nudge-bot bot commented Aug 6, 2024

Hello @D0LLi. The PR is blocked on your approval. Please review it ASAP.

nudge-bot bot commented Aug 7, 2024

Hello @D0LLi. The PR is blocked on your approval. Please review it ASAP.

nudge-bot bot commented Aug 8, 2024

Hello @D0LLi. The PR is blocked on your approval. Please review it ASAP.

nudge-bot bot commented Aug 9, 2024

Hello @D0LLi. The PR is blocked on your approval. Please review it ASAP.

pixeebot bot commented Aug 10, 2024

I'm confident in this change, but I'm not a maintainer of this project. Do you see any reason not to merge it?

If this change was not helpful, or you have suggestions for improvements, please let me know!

pixeebot bot commented Aug 11, 2024

Just a friendly ping to remind you about this change. If there are concerns about it, we'd love to hear about them!

nudge-bot bot commented Aug 12, 2024

Hello @D0LLi. The PR is blocked on your approval. Please review it ASAP.

1 similar comment

nudge-bot bot commented Aug 13, 2024

Hello @D0LLi. The PR is blocked on your approval. Please review it ASAP.

pixeebot bot commented Aug 17, 2024

This change may not be a priority right now, so I'll close it. If there was something I could have done better, please let me know!

You can also customize me to make sure I'm working with you in the way you want.

@pixeebot pixeebot bot closed this Aug 17, 2024

@gitginie gitginie bot left a comment

@pixeebot[bot]
Thank you for your contribution to this repository! We appreciate your effort in closing this pull request.
Happy coding!

Labels
🚦awaiting triage 🤖 bot NO JIRA This PR does not have a Jira Ticket PR:size/XS Denotes a Pull Request that changes 0-9 lines. release This PR is a release size/XS
2 participants