Contributor guidelines

Ian Hellen edited this page Dec 14, 2023 · 17 revisions

Contributor Guidance:

Contributions of improvements, fixes and new features are all welcome. Whether this is your first time contributing to a project or you are a seasoned open-source contributor, we welcome your contribution. In this guide you can find a few pointers to help you create a great contribution.

What to contribute:

There are many things that can make a good contribution. It might be a fix for a specific issue you have come across, an improvement to an existing feature that you have thought about such as a new data connector or threat intelligence provider, or a completely new feature.

If you don’t have a specific idea in mind, take a look at the Issues page on GitHub: https://github.com/microsoft/msticpy/issues

This page tracks a range of issues, enhancements, and features that members of the community have thought of. The MSTICPy team also uses these issues to track its own work, so the page includes many items we have added ourselves.

The issues are tagged with labels that describe the type of issue. You may see some with the ‘good first issue’ tag. These are issues that the team thinks would suit someone contributing to MSTICPy for the first time; however, anyone is welcome to work on any Issue. If you decide to start working on an Issue, please comment on it so that we can assign it to you and so that other members of the community know it is being worked on and don’t duplicate the work. If you are unclear about what an Issue involves, feel free to comment on it to get clarification from others.

What makes a good contribution

Whilst there is no one thing that makes a contribution good, here are some guidelines:

Single Issues

Focus your contribution on a single thing per PR raised, whether it be a feature or a fix. If you have multiple things you want to contribute, consider splitting them into multiple PRs. Keeping it to a single item makes it easier for others to see what you are contributing and how it fits with the rest of the project.

Documentation

Make it clear what you are contributing, why it's important, and how it works. This provides much-needed clarity for others when reviewing contributions and helps to highlight the value of your contribution.

Test Coverage

Ensure your contribution has the highest possible test coverage. You should aim for at least 80% coverage and ideally reach 100%. If you can’t reach 80% for whatever reason, let us know when you raise a PR and we can work with you on this.

Using Git

To contribute you will need to fork the MSTICPy repo, create a branch for your contribution, and then raise a Pull Request (PR) to merge the changes back into MSTICPy’s main branch. Please do not make changes to the main branch of your fork and submit this as a PR. You should also consider granting permission on your fork so that we can push changes back to your forked branch. Sometimes it's quicker for us to make a small change to fix something than to ask you to make the change. If we cannot push changes back to your branch, this is not possible.

If you are unfamiliar with Git and GitHub you can find some guidance here: https://docs.github.com/en/get-started/quickstart/set-up-git

Where to get help

The MSTICPy team is more than happy to help support your contributions. If you need help, you can comment on the Issue you are working on or email msticpy@microsoft.com

Code Guidelines:

Unit Tests

All new code should have unit tests with at least 80% code coverage. There are some exceptions to this: for example, code that accesses online data and requires authentication. We can work with you on getting this to work in our build. We use pytest but most of the existing tests are also Python unittest compatible.
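As a minimal sketch of what such tests can look like, the example below tests a small helper. The function `split_indicators` and the test names are hypothetical and are not part of MSTICPy; they only illustrate the shape of a test module. Plain `test_*` functions with bare `assert` statements are enough for pytest to collect and run.

```python
"""Sketch of pytest-style unit tests for a hypothetical helper function."""
from typing import List


def split_indicators(text: str) -> List[str]:
    """Hypothetical helper (not part of MSTICPy): split a comma-separated string."""
    return [item.strip() for item in text.split(",") if item.strip()]


def test_split_indicators_basic():
    """Typical input is split and stripped of surrounding whitespace."""
    result = split_indicators("1.2.3.4, evil.example.com")
    assert result == ["1.2.3.4", "evil.example.com"]


def test_split_indicators_empty_items():
    """Empty items and blank input produce no entries."""
    assert split_indicators("a,, b ,") == ["a", "b"]
    assert split_indicators("") == []
```

Saved in a file whose name starts with `test_`, these functions would be picked up when you run `pytest` in that folder. In a real contribution the helper would live in the package and only the tests would live in the test module.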

Type hints

Use type annotations for parameters and return values in public methods, properties and functions.

from typing import Any, Dict, Optional, Union

import pandas as pd

...

def build_process_tree(
    procs: pd.DataFrame,
    schema: Optional[Union[ProcSchema, Dict[str, Any]]] = None,
    show_summary: bool = False,
    debug: bool = False,
    **kwargs,
) -> pd.DataFrame:
    """
    Build process trees from the process events.

    """

Python Type Hints documentation
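The guideline also covers properties and methods. The sketch below shows one way to annotate them; the class `QueryResult` and its members are hypothetical and do not exist in MSTICPy.

```python
from typing import Any, Dict, List, Optional


class QueryResult:
    """Hypothetical class (not part of MSTICPy) with annotated public members."""

    def __init__(
        self, rows: List[Dict[str, Any]], source: Optional[str] = None
    ) -> None:
        self._rows = rows
        self._source = source

    @property
    def row_count(self) -> int:
        """Return the number of rows held."""
        return len(self._rows)

    @property
    def source(self) -> Optional[str]:
        """Return the name of the data source, or None if unknown."""
        return self._source

    def first(self, count: int = 1) -> List[Dict[str, Any]]:
        """Return the first `count` rows."""
        return self._rows[:count]
```

Annotating `__init__` with `-> None` and using `Optional[...]` for values that may be `None` gives mypy enough information to check callers of the class.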

Docstrings

Our documentation is automatically built for Read the Docs using Sphinx. All public modules, functions, classes and methods should be documented using the numpy docstring standard.

def build_process_tree(
    procs: pd.DataFrame,
    schema: Optional[Union[ProcSchema, Dict[str, Any]]] = None,
    show_summary: bool = False,
    debug: bool = False,
    **kwargs,
) -> pd.DataFrame:
    """
    Build process trees from the process events.

    Parameters
    ----------
    procs : pd.DataFrame
        Process events (Windows 4688 or Linux Auditd)
    schema : Union[ProcSchema, Dict[str, Any]], optional
        The column schema to use, by default None.
        If supplied as a dict it must include definitions for the
        required fields in the ProcSchema class
        If None, then the schema is inferred
    show_summary : bool
        Shows summary of the built tree, default is False.
    debug : bool
        If True produces extra debugging output,
        by default False

    Returns
    -------
    pd.DataFrame
        Process tree dataframe.

    See Also
    --------
    ProcSchema

    """

numpy docstring guide

Code Formatting

We use black everywhere and enforce this in the build.

Black - The Uncompromising Code Formatter

Linters/Code Checkers

We use the following code checkers:

  • pylint: pylint msticpy
  • mypy: mypy msticpy
  • bandit: bandit -r -lll -s B303,B404,B603,B607 msticpy
  • flake8: flake8 --max-line-length=90 --ignore=E501,W503 --exclude tests
  • pydocstyle: pydocstyle --convention=numpy msticpy
  • isort: isort --profile black msticpy
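If you prefer to run all of these from one place, a small script along the following lines is one option. This is only a sketch: the command strings are copied from the list above, but the script itself, `run_checks` and `main` are illustrative names and not something shipped with MSTICPy. It assumes the tools are installed and on your PATH (a missing tool raises `FileNotFoundError`) and that you run it from the root of the repo.

```python
"""Sketch: run the code checkers listed above from a single Python script."""
import shlex
import subprocess
from typing import Callable, Dict

# Commands copied from the list above; the script itself is illustrative.
CHECKS: Dict[str, str] = {
    "pylint": "pylint msticpy",
    "mypy": "mypy msticpy",
    "bandit": "bandit -r -lll -s B303,B404,B603,B607 msticpy",
    "flake8": "flake8 --max-line-length=90 --ignore=E501,W503 --exclude tests",
    "pydocstyle": "pydocstyle --convention=numpy msticpy",
    "isort": "isort --profile black msticpy",
}


def run_checks(
    checks: Dict[str, str],
    runner: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> Dict[str, int]:
    """Run each command and return a mapping of checker name to exit code."""
    results: Dict[str, int] = {}
    for name, command in checks.items():
        # check=False: keep going after a failure so every checker reports.
        completed = runner(shlex.split(command), check=False)
        results[name] = completed.returncode
    return results


def main() -> int:
    """Run all checkers, print a summary, and return the number of failures."""
    exit_codes = run_checks(CHECKS)
    for name, code in exit_codes.items():
        status = "ok" if code == 0 else f"failed ({code})"
        print(f"{name}: {status}")
    return sum(1 for code in exit_codes.values() if code != 0)
```

Calling `main()` runs each checker in turn. The `runner` parameter exists only so the function can be exercised in a test without invoking the real tools.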

Pre-Commit

We have a pre-commit configuration in the msticpy repo. This runs the checks above (apart from mypy) when you commit. See Pre-Commit Script for more details.

Note: when using pre-commit in an Anaconda environment, you may see SSL errors while pre-commit is building the Python environment to run the checks in. There's a simple fix: in the conda environment that you are running pre-commit from, copy the DLLs libcrypto-1_1-x64.dll and libssl-1_1-x64.dll
from {path_to_conda}/envs/{your_env_name}/Library/bin
to {path_to_conda}/envs/{your_env_name}/DLLs

You may need to re-activate the Anaconda environment. See this Stack Overflow article for more details.

VSCode support

See this page for task definitions to run linters/checkers in VSCode

Create a branch

Before you submit a PR, create a working branch in your fork and put your changes in that. This makes it easier for you to re-sync the main branch if it gets updated while you are working on your changes.
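The branch workflow can be sketched as the git commands printed by the short script below. The GitHub user name, the branch name and the remote name `upstream` are placeholders chosen for illustration, and the script assumes your fork kept the default repo name `msticpy`. Merging `upstream/main` is one way to re-sync; rebasing is another.

```python
"""Sketch: print the git commands for a fork-and-branch workflow."""
from typing import List

UPSTREAM_URL = "https://github.com/microsoft/msticpy.git"


def branch_workflow(github_user: str, branch: str) -> List[str]:
    """Return git commands to clone a fork and start a working branch."""
    return [
        f"git clone https://github.com/{github_user}/msticpy.git",
        "cd msticpy",
        # Track the original repo so the fork can be re-synced later.
        f"git remote add upstream {UPSTREAM_URL}",
        f"git checkout -b {branch}",
        # After committing your changes, publish the branch to your fork.
        f"git push -u origin {branch}",
    ]


def resync_commands(branch: str) -> List[str]:
    """Return git commands to bring a working branch up to date with main."""
    return [
        "git fetch upstream",
        f"git checkout {branch}",
        "git merge upstream/main",
    ]


# Placeholder values: substitute your own user name and branch name.
for command in branch_workflow("your-github-user", "fix-docstring-typo"):
    print(command)
```

Once the branch is pushed, the PR is raised from that branch on your fork against the main branch of the MSTICPy repo.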

See also

A musical guide

The PEP8 Song

Brilliantly written and performed by @lemonsaurus_rex