jonnagel/fix_df_cols

fix_df_cols

I dislike column names that slow down my work. This package converts all column names to snake_case, using the following rules:

  1. All column names are converted to lowercase.
  2. All spaces are replaced with underscores.
  3. Any character in a column name that isn't a letter, digit, or underscore is removed.
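The three rules can be sketched in a few lines of plain Python. This is only an illustration of the rules as stated, not the package's actual implementation: the function name `to_snake_case` is hypothetical, and treating "letter" as ASCII-only is an assumption.

```python
import re


def to_snake_case(name: str) -> str:
    """Apply the three rules to one column name (hypothetical helper)."""
    lowered = name.lower()                    # rule 1: lowercase
    underscored = lowered.replace(" ", "_")   # rule 2: spaces -> underscores
    # rule 3: drop anything that is not a letter, digit, or underscore
    # (assumption: only ASCII letters count as letters)
    return re.sub(r"[^a-z0-9_]", "", underscored)


print([to_snake_case(c) for c in ["abc -@#ab%@", "12 3", "a**bcEB"]])
# ['abc_ab', '12_3', 'abceb']
```

Note that the order matters: spaces are turned into underscores before the removal step, so they survive as separators instead of being stripped.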

Installation

git clone https://github.com/jonnagel/fix_df_cols.git

TODO

Usage

Create a pd.DataFrame as normal, then call its clean() method to fix the column names. Importing CleanDF adds a clean() method to all pd.DataFrames; calling it fixes the column names in place.

import pandas as pd
from fix_df_cols.src.fixdfcols import CleanDF
bad_df = pd.DataFrame(columns=['abc -@#ab%@', '12 3', 'a**bcCCC'])
# bad_df.columns
#   Index(['abc -@#ab%@', '12 3', 'a**bcCCC'], dtype='object')
bad_df.clean()
# bad_df.columns
#   Index(['abc_ab', '12_3', 'abcccc'], dtype='object')

Standalone example

# A Python list or pd.Index can also be cleaned directly, without a DataFrame
from src.fixdfcols import FixCols
clean_cols = FixCols(['abc -@#ab%@', '12 3', 'a**bcEB']).columns_clean
# clean_cols
# ['abc_ab', '12_3', 'abceb']

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

License

MIT
