Skip to content
master
Switch branches/tags
Code

Latest commit

 

Git stats

Files

Permalink
Failed to load latest commit information.
Type
Name
Latest commit message
Commit time
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

About category_encoders

Home: https://github.com/scikit-learn-contrib/categorical_encoding

Package license: BSD-3-Clause

Feedstock license: BSD 3-Clause

Summary: A collection sklearn transformers to encode categorical variables as numeric

A set of scikit-learn-style transformers for encoding categorical variables into numeric with different techniques. While ordinal, one-hot, and hashing encoders have similar equivalents in the existing scikit-learn version, the transformers in this library all share a few useful properties:

  • First-class support for pandas dataframes as an input (and optionally as output)

  • Can explicitly configure which columns in the data are encoded by name or index, or infer non-numeric columns regardless of input type

  • Can drop any columns with very low variance based on training set optionally

  • Portability: train a transformer on data, pickle it, reuse it later and get the same thing out.

  • Full compatibility with sklearn pipelines, input an array-like dataset like any other transformer

Current build status

All platforms:

Current release info

Name Downloads Version Platforms
Conda Recipe Conda Downloads Conda Version Conda Platforms

Installing category_encoders

Installing category_encoders from the conda-forge channel can be achieved by adding conda-forge to your channels with:

conda config --add channels conda-forge

Once the conda-forge channel has been enabled, category_encoders can be installed with:

conda install category_encoders

It is possible to list all of the versions of category_encoders available on your platform with:

conda search category_encoders --channel conda-forge

About conda-forge

Powered by NumFOCUS

conda-forge is a community-led conda channel of installable packages. In order to provide high-quality builds, the process has been automated into the conda-forge GitHub organization. The conda-forge organization contains one repository for each of the installable packages. Such a repository is known as a feedstock.

A feedstock is made up of a conda recipe (the instructions on what and how to build the package) and the necessary configurations for automatic building using freely available continuous integration services. Thanks to the awesome service provided by CircleCI, AppVeyor and TravisCI it is possible to build and upload installable packages to the conda-forge Anaconda-Cloud channel for Linux, Windows and OSX respectively.

To manage the continuous integration and simplify feedstock maintenance conda-smithy has been developed. Using the conda-forge.yml within this repository, it is possible to re-render all of this feedstock's supporting files (e.g. the CI configuration files) with conda smithy rerender.

For more information please check the conda-forge documentation.

Terminology

feedstock - the conda recipe (raw material), supporting scripts and CI configuration.

conda-smithy - the tool which helps orchestrate the feedstock. Its primary use is in the construction of the CI .yml files and simplify the management of many feedstocks.

conda-forge - the place where the feedstock and smithy live and work to produce the finished article (built conda distributions)

Updating category_encoders-feedstock

If you would like to improve the category_encoders recipe or build a new package version, please fork this repository and submit a PR. Upon submission, your changes will be run on the appropriate platforms to give the reviewer an opportunity to confirm that the changes result in a successful build. Once merged, the recipe will be re-built and uploaded automatically to the conda-forge channel, whereupon the built conda packages will be available for everybody to install and use from the conda-forge channel. Note that all branches in the conda-forge/category_encoders-feedstock are immediately built and any created packages are uploaded, so PRs should be based on branches in forks and branches in the main repository should only be used to build distinct package versions.

In order to produce a uniquely identifiable distribution:

  • If the version of a package is not being increased, please add or increase the build/number.
  • If the version of a package is being increased, please remember to return the build/number back to 0.

Feedstock Maintainers

About

A conda-smithy repository for category_encoders.

Resources

License

Releases

No releases published

Packages

No packages published

Languages