ValueError: Supported types are: <class 'str'> or typing.Callable. Got <class 'numpy._ArrayFunctionDispatcher'> instead. #104

Closed
wakame1367 opened this issue Jul 11, 2023 · 1 comment · Fixed by #105
Assignees
Labels
bug Something isn't working

Comments

@wakame1367
Collaborator

wakame1367 commented Jul 11, 2023

Summary

Environment

Python 3.9.10 (tags/v3.9.10:f2f3f53, Jan 17 2022, 15:14:21) [MSC v.1929 64 bit (AMD64)] on win32

Error message

Running pytest produced the following error message.

> pytest
========================================================================= short test summary info ========================================================================= 
FAILED tests/feature/test_groupby.py::test_return_type_by_aggregation - ValueError: Supported types are: <class 'str'> or typing.Callable. Got <class 'numpy._ArrayFunctionDispatcher'> instead.
FAILED tests/feature/nlp/test_bert.py::test_bert_jp - requests.exceptions.ConnectionError: HTTPSConnectionPool(host='cdn-lfs.huggingface.co', port=443): Read timed out.    
======================================================== 2 failed, 106 passed, 1084 warnings in 332.27s (0:05:32) ========================================================= 
pytest FAILURES message

_____________________________________________________________________ test_return_type_by_aggregation _____________________________________________________________________

iris_dataframe = ( sl sw pl pw species
0 5.1 3.5 1.4 0.2 0.0
1 4.9 3.0 1.4 0.2 0.0
2 4.7 3.2 1.3... 3.4 5.4 2.3 2.0
149 5.9 3.0 5.1 1.8 2.0

[150 rows x 5 columns], 'species', ['sl', 'sw', 'pl', 'pw'])

def test_return_type_by_aggregation(iris_dataframe):
    df, group_key, group_values = iris_dataframe
    agg_methods = ["max", np.sum, custom_function]
    new_df, new_cols = aggregation(df, group_key, group_values,
                                   agg_methods)

tests\feature\test_groupby.py:27:


input_df = sl sw pl pw species
0 5.1 3.5 1.4 0.2 0.0
1 4.9 3.0 1.4 0.2 0.0
2 4.7 3.2 1.3 ... 6.5 3.0 5.2 2.0 2.0
148 6.2 3.4 5.4 2.3 2.0
149 5.9 3.0 5.1 1.8 2.0

[150 rows x 5 columns]
group_key = 'species', group_values = ['sl', 'sw', 'pl', 'pw']
agg_methods = ['max', <function sum at 0x000002226B6C2CF0>, <function custom_function at 0x000002221318C280>]

def aggregation(
        input_df: pd.DataFrame,
        group_key: str,
        group_values: List[str],
        agg_methods: List[Union[str, FunctionType]],
) -> Tuple[pd.DataFrame, List[str]]:
    """
    Aggregate values after grouping table rows by a given key.

    Args:
        input_df:
            Input data frame.
        group_key:
            Used to determine the groups for the groupby.
        group_values:
            Used to aggregate values for the groupby.
        agg_methods:
            List of function or function names, e.g. ['mean', 'max', 'min', numpy.mean].
            Do not use a lambda function because the name attribute of the lambda function cannot generate a unique string of column names in <lambda>.
    Returns:
        Tuple of output dataframe and new column names.
    """
    new_df = input_df.copy()

    new_cols = []
    for agg_method in agg_methods:
        if _is_lambda_function(agg_method):
            raise ValueError('Not supported lambda function.')
        elif isinstance(agg_method, str):
            pass
        elif isinstance(agg_method, FunctionType):
            pass
        else:
            raise ValueError('Supported types are: {} or {}.'
                             ' Got {} instead.'.format(str, Callable, type(agg_method)))

E ValueError: Supported types are: <class 'str'> or typing.Callable. Got <class 'numpy._ArrayFunctionDispatcher'> instead.

nyaggle\feature\groupby.py:89: ValueError
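
The docstring in the traceback above warns against lambdas because their name attribute cannot produce unique column names. A minimal sketch of that collision follows. The `column_name` helper and its `agg_{method}_{column}` format are hypothetical and used only for illustration; they are not taken from nyaggle.

```python
def custom_function(values):
    return sum(values)


# Two different lambdas; assigning them to variables does not change __name__.
double = lambda values: [v * 2 for v in values]
negate = lambda values: [-v for v in values]


def column_name(agg_method, column):
    # Hypothetical naming scheme (an assumption, not nyaggle's real format).
    if isinstance(agg_method, str):
        method_name = agg_method
    else:
        method_name = agg_method.__name__
    return "agg_{}_{}".format(method_name, column)


names = [column_name(m, "sl") for m in ("max", custom_function, double, negate)]
print(names)
```

Both lambdas report `<lambda>` as their name, so the last two generated names are identical, whereas the string and the named function each yield a distinct one.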

Cause

The test passes numpy.sum as one of the entries in the agg_methods argument of aggregation.

def test_return_type_by_aggregation(iris_dataframe):
    df, group_key, group_values = iris_dataframe
    agg_methods = ["max", np.sum, custom_function]
    new_df, new_cols = aggregation(df, group_key, group_values,
                                   agg_methods)
    assert isinstance(new_df, pd.DataFrame)
    assert isinstance(new_cols, list)

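The type check in aggregation recognizes only the following three kinds of agg_methods entries:

  • <class 'str'>
  • <class 'function'>
  • lambda (detected, then rejected with its own ValueError)

The class of numpy.sum is <class 'numpy._ArrayFunctionDispatcher'>, which matches none of these, so it falls through to the else branch of the if statement and raises.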

for agg_method in agg_methods:
    if _is_lambda_function(agg_method):
        raise ValueError('Not supported lambda function.')
    elif isinstance(agg_method, str):
        pass
    elif isinstance(agg_method, FunctionType):
        pass
    else:
        raise ValueError('Supported types are: {} or {}.'
                         ' Got {} instead.'.format(str, Callable, type(agg_method)))
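
To illustrate why the `isinstance(agg_method, FunctionType)` branch misses such objects, here is a small sketch that uses only the standard library. The `Dispatcher` class is a hypothetical stand-in for a callable object that is not a plain Python function; it is not how numpy implements `_ArrayFunctionDispatcher`.

```python
from types import FunctionType


def custom_function(values):
    return sum(values)


class Dispatcher:
    """Hypothetical stand-in: callable, but not created by `def` or `lambda`."""

    def __init__(self, func):
        self._func = func

    def __call__(self, *args, **kwargs):
        return self._func(*args, **kwargs)


wrapped_sum = Dispatcher(sum)

# FunctionType matches only functions created by `def` or `lambda`.
checks = {
    "custom_function": isinstance(custom_function, FunctionType),
    "len": isinstance(len, FunctionType),
    "wrapped_sum": isinstance(wrapped_sum, FunctionType),
}
print(checks)
print(callable(wrapped_sum), wrapped_sum([1, 2, 3]))
```

Only `custom_function` passes the check. The built-in `len` and the callable wrapper both work when called, yet neither is an instance of `FunctionType`, which is the same situation numpy.sum ends up in here.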

Proposed fix

#105

@wakame1367 wakame1367 added the enhancement (New feature or request) and bug (Something isn't working) labels and removed the enhancement label Jul 11, 2023
@wakame1367 wakame1367 self-assigned this Jul 11, 2023
@nyanp
Owner

nyanp commented Jul 11, 2023

@wakame1367
The following change in the latest numpy appears to be the cause.
numpy/numpy#24019

A check based on Callable would also let things like int through, so inspect.isroutine may be a better choice.
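
As a rough illustration of the difference between the two checks, the sketch below compares `callable` and `inspect.isroutine` on a few standard-library objects. The `is_supported` helper is hypothetical and is not the code merged in #105. Whether numpy.sum passes `inspect.isroutine` on a given numpy version is not shown here and should be verified separately.

```python
import inspect


def custom_function(values):
    return max(values) - min(values)


def is_supported(agg_method):
    # Hypothetical helper: accept function names and routines only.
    return isinstance(agg_method, str) or inspect.isroutine(agg_method)


candidates = [
    ("'max'", "max"),
    ("custom_function", custom_function),
    ("len", len),
    ("int", int),
    ("42", 42),
]

for label, obj in candidates:
    print(label, callable(obj), inspect.isroutine(obj), is_supported(obj))
```

The class `int` is callable but is not a routine, so the hypothetical helper rejects it, while both user-defined and built-in functions are accepted. Lambdas are ordinary functions and also pass `inspect.isroutine`, so a separate lambda check would still be needed.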
