Helper function to extract a basic version of the metadata from a pandas DataFrame. #3

@lopezco

Add a helper function that extracts a basic version of the metadata from a pandas DataFrame.
Example:

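import pandas as pd
from pandas.api.types import is_string_dtype, is_numeric_dtype

def metadata_from_dataframe(df):
  """Return a list with one metadata dict per column of df."""
  metadata = []
  for c in df.columns:
    # is_categorical_dtype is deprecated in recent pandas releases,
    # so check the dtype directly. Categoricals must be tested first.
    if isinstance(df[c].dtype, pd.CategoricalDtype):
      metadata.append({
          'name': c,
          'type': 'category',
          'categories': sorted(df[c].dtype.categories.tolist())})
    elif is_numeric_dtype(df[c]):
      metadata.append({
          'name': c,
          'type': 'numeric'})
    elif is_string_dtype(df[c]):
      metadata.append({
          'name': c,
          'type': 'string'})
    else:
      # Any other dtype (for example datetime64) is not handled yet.
      raise ValueError('Unknown type for {}'.format(c))
  return metadata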
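
To make the expected output concrete, here is a minimal usage sketch. It repeats the helper so that it runs standalone, and it assumes an `isinstance` check against `pd.CategoricalDtype` in place of `is_categorical_dtype`, which recent pandas releases deprecate. The DataFrame and its column names (`age`, `city`, `name`) are hypothetical and only serve as an illustration.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype, is_string_dtype


def metadata_from_dataframe(df):
    """Return one metadata dict per column (same logic as the proposal above)."""
    metadata = []
    for c in df.columns:
        # Assumption: isinstance check instead of the deprecated is_categorical_dtype.
        if isinstance(df[c].dtype, pd.CategoricalDtype):
            metadata.append({
                'name': c,
                'type': 'category',
                'categories': sorted(df[c].dtype.categories.tolist())})
        elif is_numeric_dtype(df[c]):
            metadata.append({'name': c, 'type': 'numeric'})
        elif is_string_dtype(df[c]):
            metadata.append({'name': c, 'type': 'string'})
        else:
            raise ValueError('Unknown type for {}'.format(c))
    return metadata


# Hypothetical example data, made up for illustration.
df = pd.DataFrame({
    'age': [31, 45, 27],
    'city': pd.Categorical(['Paris', 'Lyon', 'Paris']),
    'name': ['Ana', 'Bo', 'Cy'],
})

metadata = metadata_from_dataframe(df)
for column in metadata:
    print(column)
```

With this input the helper should report `age` as `numeric`, `city` as `category` with categories `['Lyon', 'Paris']`, and `name` as `string`. A column of any other dtype, such as `datetime64`, would hit the final branch and raise `ValueError`.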
