Skip to content
This repository has been archived by the owner on Feb 2, 2024. It is now read-only.

Fix for string tests failed due to bad str_overload #94

Merged
merged 5 commits into from
Jul 22, 2019

Conversation

kozlov-alexey
Copy link
Contributor

No description provided.

@fschlimb
Copy link
Contributor

fschlimb commented Jul 17, 2019

This looks good.
Please provide a sentence or two per loop in a comment telling what it does/how it works.

hpat/str_ext.py Outdated
# for strip methods define overloads to call Numba implementation
# forwarding arg1 as a necessary 'chars' argument
for method in str2str_1arg:
func_text = "def str_overload(in_str, arg1):\n"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it is necessary to implement several function via auto generated source?
I would vote to make them as regular functions with required decorators.
@fschlimb What do you think about it? I mean, it might be better to stay away from such technique of programming.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I usually try to avoid code duplication as much as possible. There will be cases where such code generation is required. So I tend to think it is not too bad.
Even though I am not sure I would now to do this with only decorators, such simpler cases can be handled with higher lever functions which call @overload. Here is a small POC:

from numba import njit, objmode, types
from numba.extending import overload

# dummy default python funcs
def x(a):
    return a+a
def y(a):
    return a*a
def z(a):
    return a**a

# a function generically overloading a given function with one arg
def ovl(func):
    @overload(func)
    def _f(a):
        def _ff(a):
            with objmode(r='int32'):
                r = func(a)
                print(':) -> ', func(a))
            return r
        return _ff

# use the above overloader to overload x and y
ovl(x)
ovl(y)

# we can now use them in a jitted function
@njit
def z():
    return x(4) + y(4)

r = z()
print('24 ?=', r)

I am not sure which one I like better. I am torn.

Copy link
Contributor

@shssf shssf Jul 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fschlimb I agree, code duplication is not good approach but autogeneration is also not good (I would say it is worse than code duplication). I would vote for any implementation that avoid code generation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fine with me as long as we are not duplicating code.
Just out of curiosity: why do you think code generation is bad? HPAT is a code-generator, numba is, and even python is. So this is not really unusual in this context.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fschlimb Because it non debuggable, non performance assessment, and etc. It only looks like a good approach but if something goes wrong it will be a real headache.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fschlimb We have one more option - generate the code at build stage, but avoid doing this in runtime.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Frank, Sergey, thank you both for provided input!
I will try to re-write it using provided example, as to me it looks better than calling exec (both due to debugging and security considerations).

[BUG] Fixed problems with generation parquet files (IntelPython#93)
@shssf
Copy link
Contributor

shssf commented Jul 19, 2019

@kozlov-alexey Do you plan to make changes in this PR?

@kozlov-alexey
Copy link
Contributor Author

@shssf Yes, I'm about to push it)

@kozlov-alexey
Copy link
Contributor Author

kozlov-alexey commented Jul 19, 2019

As per suggestion above I've rewritten it using decorators and getitem.
Two issues are observed:

  1. Test now runs a bit slower, 4.6-4.7 seconds vs 4.2-4.3 seconds before (not sure why it takes more time), that needs to be investigated probably.
  2. Wasn't able to verify on master due to import error from this line in test_strings.py:
    "from .gen_test_data import ParquetGenerator" (even after installing pyspark). I was able to verify only based on the commit prior to [BUG] Fixed problems with generation parquet files #93.

@shssf
Copy link
Contributor

shssf commented Jul 19, 2019

  1. Wasn't able to verify on master due to import error from this line in test_strings.py:
    "from .gen_test_data import ParquetGenerator" (even after installing pyspark). I was able to verify only based on the commit prior to [BUG] Fixed problems with generation parquet files #93.

PR #96 might help

@fschlimb fschlimb merged commit db2736f into IntelPython:master Jul 22, 2019
@kozlov-alexey kozlov-alexey deleted the feature/fix_strip_overload branch August 8, 2019 10:36
kozlov-alexey added a commit to kozlov-alexey/sdc that referenced this pull request Oct 4, 2019
Fix for string tests failed due to bad str_overload
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants