Refactor OpenMined PyDP README #279

Merged: chinmayshah99 merged 2 commits into OpenMined:dev from 8bitmp3:patch-1 on Aug 29, 2020

@8bitmp3 8bitmp3 commented Aug 28, 2020

Hi @chinmayshah99! I went through the OpenMined PyDP README and introduced some improvements. Let me know what you think.

  1. The PyDP background section

    • Split long paragraphs with many sentences into several smaller paragraphs - it can make it easier to scan when reading
    • Use active voice when you can - it's better for readability than passive voice
    • Add "more and more" to users that use ML/data science
    • Use "machine learning" or "data science" instead of "data analytics" (which may be too narrow a definition)
    • Add "more" to "innovative solutions"
    • Rephrase: "bring in some privacy concerns" -> "can cause privacy issues"
    • Add "potentially" and passive voice ("[have] been trained on") in "the data they've trained on and could leak"
    • Refactor a section on why differential privacy: -> "To help measure sensitive data leakage and reduce the possibility of it happening, there is a mathematical framework called differential privacy."
    • Diff. privacy (not PyDP, which is a Py-wrapper) helps reduce data leakage, so: instead of "...PyDP is helping us achieve better privacy" -> "...with PyDP you can control the privacy guarantee and accuracy of your model in Python."
  2. The Installation section

    • Use "macOS" as a platform name, like Linux or Windows (vs. OSX, which is an OS version)
    • Change "real soon" -> "soon" (less informal)
    • PyPI is the package index behind pip -> as in: use PyPI to "pip install", so to speak
  3. Change the title from "Usage" to "Examples"

    • Add an explanation to the dev/examples link, such as "Curated list of tutorials and sample code for PyDP", instead of just saying "this example"

    • Change "Jupyer Notebook" [typo] -> "An introduction to PyDP (a Jupyter notebook)"

    • Rephrase: "for use via code explanation..." (it sounded a bit strange) -> "To get started..."

    • Rephrase "a sample of usage" -> "Example: calculate the Bounded Mean" (more specific)

    • Refactor code comments:

      - # To calculate the Bounded Mean
      - # epsilon is a number between 0 and 1 denoting privacy threshold
      - # It measures the acceptable loss of privacy (with 0 meaning no loss is acceptable)
      - # If both the lower and upper bounds are specified, 
      - # x = dp.BoundedMean(epsilon: double, lower: int, upper: int)
      + # Calculate the Bounded Mean
      + # Structure: `BoundedMean(epsilon: double, lower: int, upper: int)`
      + # `epsilon`: a Double, between 0 and 1, denoting the privacy threshold,
      + #            measures the acceptable loss of privacy (with 0 meaning no loss is acceptable)
      + # `lower` and `upper`: Integers, representing lower and upper bounds, respectively
      ...
      - # If lower and upper bounds are not specified,
      - # DP library automatically calculates these bounds
      + # If the lower and upper bounds are not specified,
      + # PyDP automatically calculates these bounds
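
To make the roles of `epsilon`, `lower`, and `upper` in the comments above more concrete, here is a minimal, self-contained sketch of a bounded mean with Laplace noise. It is not PyDP's or Google's implementation. It assumes the number of values is public and that neighboring datasets differ by replacing one record, and it uses a textbook Laplace mechanism. The names `laplace_noise` and `bounded_mean_sketch` are hypothetical, and the code is for illustration only, not for use on real sensitive data.

```python
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def bounded_mean_sketch(values, epsilon, lower, upper, rng=None):
    """Illustrative noisy mean of values clamped to [lower, upper].

    Assumption (not taken from PyDP): len(values) is public, so noise is
    calibrated only to how far one replaced record can move the mean.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if lower >= upper:
        raise ValueError("lower must be less than upper")
    if not values:
        raise ValueError("values must not be empty")
    rng = rng or random.Random()
    # Clamping is what makes the bounds meaningful: outliers cannot
    # contribute more than `upper` or less than `lower`.
    clamped = [min(max(v, lower), upper) for v in values]
    # Replacing one record moves the clamped mean by at most this much.
    sensitivity = (upper - lower) / len(clamped)
    # Smaller epsilon means a larger noise scale, so more privacy and less accuracy.
    noise = laplace_noise(sensitivity / epsilon, rng)
    return sum(clamped) / len(clamped) + noise


# Hypothetical data: the value 100 is clamped to the upper bound of 10.
carrots = [1, 2, 3, 100]
print(bounded_mean_sketch(carrots, epsilon=0.5, lower=0, upper=10))
```

The result changes on every run because the noise is random. Narrower bounds or a larger `epsilon` shrink the noise, which is the privacy and accuracy trade-off the README text refers to.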

Cosmetic changes:

  • Limit each line to 80 characters (common practice) and remove trailing spaces
  • Add blank lines between section titles and first paragraphs
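
The two line-level rules above are easy to check mechanically. The following is a small sketch of such a check. The helper `find_style_issues` is hypothetical and not part of the PyDP tooling. Note that Markdown lines containing long URLs will be flagged by the length rule and may need to be exempted by hand.

```python
def find_style_issues(text, max_len=80):
    """Return (line_number, problem) pairs for long lines and trailing whitespace."""
    issues = []
    for number, line in enumerate(text.splitlines(), start=1):
        if len(line) > max_len:
            issues.append((number, f"longer than {max_len} characters"))
        if line != line.rstrip():
            issues.append((number, "trailing whitespace"))
    return issues


# Made-up sample: line 3 ends with two spaces, line 4 has 81 characters.
sample = "# PyDP\n\nShort line.  \n" + "x" * 81 + "\n"
for number, problem in find_style_issues(sample):
    print(f"line {number}: {problem}")
# prints:
# line 3: trailing whitespace
# line 4: longer than 80 characters
```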

Check out this paper: Deep Learning with Differential Privacy (arXiv:1607.00133, first posted in July 2016): https://www.arxiv-vanity.com/papers/1607.00133/ (OpenAI/Google)

Major diffs:

- In today's data-driven world, data analytics is used by researchers or data
- scientists to create better models or innovative solutions for a better future.
+ In today's data-driven world, more and more researchers and data scientists
+ use machine learning to create better models or more innovative solutions for
+ a better future.
...
- These models often tend to handle sensitive or personal data, which bring in
- some privacy concerns.
+ These models often tend to handle sensitive or personal data, which can cause
+ privacy issues.
...
- the data they've trained on and could leak these details later on. Differential
+ the data they've been trained on and could potentially leak these details
+ later on. Differential
...

- Differential privacy is a mathematical framework for measuring this privacy leakage and
- reducing the possibility of it happening.
+ To help measure sensitive data leakage and reduce the possibility of it
+ happening, there is a mathematical framework called differential privacy.

- This is where PyDP comes in. PyDP is a Python wrapper for Google's [Differential
- Privacy](https://github.com/google/differential-privacy) project. The library
+ In 2020, OpenMined created a Python wrapper for Google's [Differential
+ Privacy](https://github.com/google/differential-privacy) project called PyDP.
+ The library
  provides a set of ε-differentially private algorithms, which can be used to
  produce aggregate statistics over numeric data sets containing private or
- sensitive information. Thus, PyDP is helping us achieve better privacy.
+ sensitive information. Therefore, with PyDP you can control the privacy
+ guarantee and accuracy of your model written in Python.
- **Things to remember about PyDP :**
+ **Things to remember about PyDP:**
...
- - :fire: Currently supports Linux and OSX. (Windows coming real soon... :smiley:)
+ - :fire: Currently supports only Linux and macOS (Windows support coming soon :smiley:)
- - :star: Supports all the Python 3+ versions.
+ - :star: Use Python 3.x.
- Use the package manager [pip](https://pip.pypa.io/en/stable/) to install PyDP.
+
+ To install PyDP, use the [PiPy](https://pip.pypa.io/en/stable/) package manager:
...
+ (If you have `pip3` separately for Python 3.x, use `pip3 install python-dp`)
- # Usage
-
- Refer to [this example](https://github.com/OpenMined/PyDP/tree/dev/examples) to understand PyDP library usage.
+ # Examples
+
+ Refer to the
+ [curated list](https://github.com/OpenMined/PyDP/tree/dev/examples)
+ of tutorials and sample code to learn more about the PyDP library.
...
- For usage via code explanation, refer to
- [Jupyer Notebook]https://github.com/OpenMined/PyDP/blob/dev/examples/carrots_demo/carrots_demo.ipynb)
- or [Python file](https://github.com/OpenMined/PyDP/blob/dev/examples/carrots.py) for carrot demo.
+ You can also get started with
+ [an introduction to PyDP](https://github.com/OpenMined/PyDP/blob/dev/examples/carrots_demo/carrots_demo.ipynb)
+ (a Jupyter notebook) and
+ [the carrots demo](https://github.com/OpenMined/PyDP/blob/dev/examples/carrots_demo/carrots.py)
+ (a Python file).
...
- A sample of usage can be found below:
+ Example: calculate the Bounded Mean
...

```diff
- Some of the good learning resources to get started with Python differential
- privacy (PyDP) project and understand the concepts behind it can be found
- [here](https://github.com/OpenMined/PyDP/blob/dev/resources.md).
+
+ Go to [resources](https://github.com/OpenMined/PyDP/blob/dev/resources.md)
+ to learn more about differential privacy.
- ## Support
+ ## Support and Community on Slack
+
- For support in using this library, please join the **#lib_pydp** Slack channel.
- If you’d like to follow along with any code changes to the library, please join
- the **#code_dp_python** Slack channel.
- [Click here to join our Slack community!](https://slack.openmined.org)
+ If you have questions about the PyDP library, join
+ [OpenMined's Slack](https://slack.openmined.org) and check the
+ **#lib_pydp** channel. To follow the code source changes, join
+ **#code_dp_python**.
- If you'd like to contribute to this project please read these
- [guidelines](https://github.com/OpenMined/PyDP/blob/dev/contributing.md).
+ To contribute to the PyDP project, read the
+ [guidelines](https://github.com/OpenMined/PyDP/blob/dev/contributing.md).
...
- Pull requests are welcome. For major changes, please open an issue first to
- discuss what you would like to change.
+ Pull requests are welcome. If you want to introduce major changes, please
+ open an issue first to discuss what you would like to change.
```

@chinmayshah99 chinmayshah99 left a comment

We made changes to the packaging structure but never updated the readme. Can you please make these changes?

@chinmayshah99

Hey @8bitmp3 these suggestions look great. Can these be conceptualized and put into contributing guidelines?


8bitmp3 commented Aug 28, 2020

Thanks and no worries @chinmayshah99! For Contributing, we can add a short style guide later down the road. It's a great idea.

@chinmayshah99 chinmayshah99 merged commit 8d97c15 into OpenMined:dev Aug 29, 2020
@8bitmp3 8bitmp3 mentioned this pull request Aug 31, 2020
@8bitmp3 8bitmp3 deleted the patch-1 branch September 2, 2020 13:46
dvadym added a commit to dvadym/PyDP that referenced this pull request Jul 3, 2022