Fix for Pandas merge overload wrong handling of string literals #99

kozlov-alexey · 2019-07-26T14:56:31Z

No description provided.

[BUG] Fixed problems with generation parquet files (IntelPython#93)

Fixing issue with named series handling in fillna (IntelPython#95)

Merge from public

shssf · 2019-07-26T15:38:52Z

hpat/hiframes/pd_dataframe_ext.py


-        return hpat.hiframes.api.join_dummy(
-            left, right, left_on, right_on, 'asof')
+            return hpat.hiframes.api.join_dummy(left, right, left_on, right_on, 'asof')

    return _impl


I'm not sure, but I think these functions (including merge_overload) might be implemented more simply. I mean, use an if/else clause to calculate the proper left_on and right_on variables and return the same _impl function.
At least, if I'm not mistaken, it will take fewer lines of code.

I tried that at first (i.e. assigning new_right/left_on to the proper values in the context of merge_asof_overload() and removing lines 677 and 678 altogether, hence using the same body for _impl()), but for some reason it breaks the currently working test_join1_seq_key_change1 (the only test we currently have for the "on=['A']" case). In other words, it looks like the assignment:
left_on = right_on = on
only works for lists if it's in the context of _impl(). The error was as below:
KeyError: "Failed in hpat mode pipeline (step: convert to distributed)\nlist('A',)""
probably because the type of 'on' in the context of merge_asof_overload() is not the same as the type of 'on' in _impl(), hence providing one instead of the other causes the failure.
So I ended up using two different overloads: the legacy one for lists and a new one taking literal values for strings.
Maybe there's a way to copy Numba's ConstList value 'literally' to the new_left/right_on variables, but I don't know what it is.
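
The "literal value" idea in this comment can be pictured with a plain-Python sketch that involves no Numba at all. The classes `StringLiteral` and `ListType` and the functions `make_impl` and `join_dummy` below are hypothetical stand-ins, not Numba or HPAT APIs. The sketch only shows that a typing-time function can read the literal out of the type object and bake it into the implementation it returns, while the list path forwards the runtime value.

```python
# Toy model only: StringLiteral, ListType, make_impl and join_dummy are
# hypothetical stand-ins for illustration, not real Numba/HPAT objects.

class StringLiteral:
    """Stand-in for a type object that carries a compile-time string."""
    def __init__(self, literal_value):
        self.literal_value = literal_value


class ListType:
    """Stand-in for a list type whose contents are only known at run time."""


def join_dummy(left_on, right_on, how):
    # Placeholder for the real join call; it just echoes its key arguments.
    return (left_on, right_on, how)


def make_impl(on_type):
    """Typing-time step: pick an implementation based on the type of `on`."""
    if isinstance(on_type, StringLiteral):
        key = on_type.literal_value  # constant extracted at typing time

        def _impl(on, how="inner"):
            # `key` is a closed-over constant, not the runtime argument.
            return join_dummy(key, key, how)
    else:
        def _impl(on, how="inner"):
            # Legacy path: forward the runtime value unchanged.
            return join_dummy(on, on, how)
    return _impl


literal_impl = make_impl(StringLiteral("A"))
list_impl = make_impl(ListType())
print(literal_impl("ignored"))  # ('A', 'A', 'inner')
print(list_impl(["A"]))         # (['A'], ['A'], 'inner')
```

Whether the real overload can be collapsed into one function this way depends on how Numba propagates the literal, which this toy does not model.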

fschlimb · 2019-07-29T08:31:43Z

Ignore my previous (deleted) comment.
What is the error of the original code?

kozlov-alexey · 2019-07-29T09:43:33Z

@fschlimb
The original error was "ValueError: Failed in hpat mode pipeline (step: typed dataframe pass)" in
get_const_or_list() in dataframe_pass.py. I'm attaching the file with full traceback here: https://github.com/IntelPython/hpat/files/3441340/full_traceback_for_test_join.txt

fschlimb · 2019-07-29T09:58:19Z

You should make sure you're using a compatible version of numpy
This means you have to understand why get_const_or_list() fails. Its name suggests it is intended to address exactly the issue you observe with literals/consts.

kozlov-alexey · 2019-07-29T12:48:23Z

@fschlimb

OK (it goes away with downgrading to scikit-learn=0.19.1 and numpy=1.16.1)
I believe the problem is in how Numba treats the 'on' variable in the original merge implementation.
That is, during Numba type inference 'on' becomes an instance of the TypeVar class, so later assigning this value to left_on and right_on does not make them the same constant (we would need to assign obj.literal_value for that). Maybe Numba fails to add the const names 'left_on' and 'right_on' to the IR when it copies a Literal[Str] object?

And in _get_const_or_list() we merely try to find const (or list) objects by name in self.func_ir, which fails because of the above (no such const names exist in the IR). I have no explanation for why it works for lists, though; probably because they are created differently (via a build sequence).

@overload_method(DataFrameType, 'merge')
@overload(pd.merge)
def merge_overload(left, right, how='inner', on=None, left_on=None,
                   right_on=None, left_index=False, right_index=False, sort=False,
                   suffixes=('_x', '_y'), copy=True, indicator=False, validate=None):

    def _impl(left, right, how='inner', on=None, left_on=None,
              right_on=None, left_index=False, right_index=False, sort=False,
              suffixes=('_x', '_y'), copy=True, indicator=False, validate=None):
        if on is not None:
            # <-------- Even though 'on' was a string literal, here we assign
            # some Numba object, not the object's literal value, hence left_on and
            # right_on are not found as const objects in self.func_ir later
            left_on = right_on = on
        return hpat.hiframes.api.join_dummy(
            left, right, left_on, right_on, how)

    return _impl
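
The lookup-by-name hypothesis above can be illustrated with a small toy, assuming a drastically simplified "IR" that is just a dict from variable names to their defining expressions. The names `toy_ir` and `get_const_or_list` are hypothetical and do not reproduce HPAT's `_get_const_or_list` or Numba's IR classes. The sketch only shows why a variable copied from a function argument has no constant behind it, while a list built inside the function does.

```python
# Toy model only: the "IR" is a plain dict and get_const_or_list is a
# hypothetical helper, not HPAT's _get_const_or_list.

# Each variable name maps to the expression that defines it.
toy_ir = {
    "on": ("arg", 3),                      # function argument: no constant recorded
    "left_on": ("var", "on"),              # plain copy of `on`
    "$const_a": ("const", "A"),            # constant created inside the function
    "keys": ("build_list", ["$const_a"]),  # list built from constants
}


def get_const_or_list(ir, name):
    """Return the constant(s) behind `name`, or raise if none is recorded."""
    kind, payload = ir[name]
    if kind == "const":
        return payload
    if kind == "build_list":
        return [get_const_or_list(ir, item) for item in payload]
    if kind == "var":
        # Follow the copy back to its source definition.
        return get_const_or_list(ir, payload)
    raise ValueError(f"no constant definition found for {name!r}")


print(get_const_or_list(toy_ir, "keys"))  # ['A']

try:
    get_const_or_list(toy_ir, "left_on")
except ValueError as exc:
    print("lookup failed:", exc)
```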

Merging commits from public repo

kozlov-alexey · 2019-08-01T14:45:41Z

@fschlimb
Thank you for helping me sort that out and suggesting a better solution!
I had to modify the check a bit though, i.e. instead of
if isinstance(on, types.NoneType)
use
if isinstance(numba.typeof(on), types.NoneType)

because, with the first version other tests that do not provide 'on' argument fail.
It looks like that check always returns True. It may be because Numba's NoneType is not inherited from python's None:
numba/numba#3590

So we can either compare to NoneType but use typeof(on), as in this commit, or we can use the following negated check:
onHasValidType = isinstance(on, (types.StringLiteral, types.List))
both of which appear to work.
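
As a side note on the negated check: Python's built-in `isinstance` takes exactly two arguments, and several alternatives have to be passed as a tuple of classes. The sketch below uses hypothetical stand-in classes rather than `numba.types`, so it says nothing about how Numba's own type objects behave; it only demonstrates the calling convention.

```python
# Stand-in classes only; they are hypothetical and not taken from numba.types.
class NoneType:
    pass


class StringLiteral:
    pass


class List:
    pass


def on_has_valid_type(on_type):
    # The second argument of isinstance is one class or a tuple of classes.
    return isinstance(on_type, (StringLiteral, List))


print(on_has_valid_type(StringLiteral()))  # True
print(on_has_valid_type(NoneType()))       # False

# Passing the alternatives as separate positional arguments is an error.
try:
    isinstance(StringLiteral(), StringLiteral, List)
except TypeError:
    print("isinstance rejects a third positional argument")
```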

fschlimb

The current check is probably better because it automatically allows for more types if 'on' is not None.

kozlov-alexey · 2019-08-01T17:07:22Z

I had to skip one of the tests again, test_merge_asof_parallel1, which still fails on the NUM_PES=3 build, now with a different error and only on the second test run. The Travis log shows that it passes on the first run, though, so the failure is probably due to the problem with
test_series_head_index_parallel1
which is not skipped at the moment, but was identified as the root cause of memory corruption that makes other tests fail.

I'm going to deal with skipping test_series_head_index_parallel1 and unskipping other tests in PR#100 anyway, so I hope it's not a big issue.

Problem description: merge_overload and merge_asof_overload functions use 'on' argument value to compute 'left_on' and 'right_on' arguments in a way that breaks type stability, causing compilation failure when 'on' is assigned a StringLiteral value.

Error:
  File "../hpat/hiframes/dataframe_pass.py", line 202, in _run_assign
    return self._run_call(assign, lhs, rhs)
  File "../hpat/hiframes/dataframe_pass.py", line 522, in _run_call
    return self._run_call_join(assign, lhs, rhs)
  File "../hpat/hiframes/dataframe_pass.py", line 1488, in _run_call_join
    left_on = self._get_const_or_list(left_on_var)
  File "../hpat/hiframes/dataframe_pass.py", line 2135, in _get_const_or_list
    raise ValueError(err_msg)
ValueError: Failed in hpat mode pipeline (step: typed dataframe pass)
None

Following tests should be fixed with this commit:
test_join_cat1 (hpat.tests.test_join.TestJoin)
test_join_cat2 (hpat.tests.test_join.TestJoin)
test_join_cat_parallel1 (hpat.tests.test_join.TestJoin)
test_join_datetime_seq1 (hpat.tests.test_join.TestJoin)
test_join_left_seq1 (hpat.tests.test_join.TestJoin)
test_join_left_seq2 (hpat.tests.test_join.TestJoin)
test_join_outer_seq1 (hpat.tests.test_join.TestJoin)
test_join_right_seq1 (hpat.tests.test_join.TestJoin)
test_merge_asof_seq1 (hpat.tests.test_join.TestJoin)

* Remove spark dependency (#102)
  Remove spark dependency from HPA; use pre-generated sdf_dt.pq
* explicitly adding data-file (#104)
* HPAT Build: Code style check for C and Python sources (#103)
* HPAT Build: Code style check for C and Python sources
* PR103. Comments partially addressed
* Code style change part 1 (#106)
* Style check config fo pystyle (#105)
* Fix for pandas.merge wrong overload handling of 'on' args (#99)
  Problem description: merge_overload and merge_asof_overload functions use 'on' argument value to compute 'left_on' and 'right_on' arguments in a way that breaks type stability, causing compilation failure when 'on' is assigned a StringLiteral value.
  Error:
    File "../hpat/hiframes/dataframe_pass.py", line 202, in _run_assign
      return self._run_call(assign, lhs, rhs)
    File "../hpat/hiframes/dataframe_pass.py", line 522, in _run_call
      return self._run_call_join(assign, lhs, rhs)
    File "../hpat/hiframes/dataframe_pass.py", line 1488, in _run_call_join
      left_on = self._get_const_or_list(left_on_var)
    File "../hpat/hiframes/dataframe_pass.py", line 2135, in _get_const_or_list
      raise ValueError(err_msg)
  ValueError: Failed in hpat mode pipeline (step: typed dataframe pass)
  None
  Following tests should be fixed with this commit:
  test_join_cat1 (hpat.tests.test_join.TestJoin)
  test_join_cat2 (hpat.tests.test_join.TestJoin)
  test_join_cat_parallel1 (hpat.tests.test_join.TestJoin)
  test_join_datetime_seq1 (hpat.tests.test_join.TestJoin)
  test_join_left_seq1 (hpat.tests.test_join.TestJoin)
  test_join_left_seq2 (hpat.tests.test_join.TestJoin)
  test_join_outer_seq1 (hpat.tests.test_join.TestJoin)
  test_join_right_seq1 (hpat.tests.test_join.TestJoin)
  test_merge_asof_seq1 (hpat.tests.test_join.TestJoin)
* [STL] PEP8 code style for 'test_strings.py', 'test_utils.py', 'test_series.py' (#97)
* pep8 style for 'test_strings.py'; flake8 check successful
* pep8 style for 'test_utils.py'
* pep8 style for 'test_series.py'; more readable
* fixed 'test_string_series'
* removed extra white spaces
* deleted mention of flake8
* trigger build
* Code style change part 2 (#107)
* code_style_change_part_2
* Add more check in style configuration (#108)
* code_style_part_3 (#109)
* Fix boost runtime issue on Ubuntu16.04 with gcc 5.4 (#92)
* Code style change part 4 (#110)
* Revert "Code style change part 4 (#110)"
  This reverts commit dfc54ee.
* Revert "Fix boost runtime issue on Ubuntu16.04 with gcc 5.4 (#92)"
  This reverts commit 231a76c.
* Revert "code_style_part_3 (#109)"
  This reverts commit 4070ce3.
* Revert "Add more check in style configuration (#108)"
  This reverts commit abf5bd0.
* Revert "Code style change part 2 (#107)"
  This reverts commit 9076493.
* Revert "[STL] PEP8 code style for 'test_strings.py', 'test_utils.py', 'test_series.py' (#97)"
  This reverts commit 8641f7a.
* Revert "Fix for pandas.merge wrong overload handling of 'on' args (#99)"
  This reverts commit a2a8ee5.
* Revert "Style check config fo pystyle (#105)"
  This reverts commit 551c0e3.
* Revert "Code style change part 1 (#106)"
  This reverts commit 6dae0b3.
* Revert "HPAT Build: Code style check for C and Python sources (#103)"
  This reverts commit 1a30e4f.
* Revert "explicitly adding data-file (#104)"
  This reverts commit 34a2260.
* Revert "Remove spark dependency (#102)"
  This reverts commit 9e77fde.
* Unskip passing dataframe tests
  Actually, following tests in test_dataframe.py are pass and can be unskipped:
  test_create1 test_len1 test_column_getitem1 test_df_apply test_df_apply_branch test_df_describe test_count1 test_append1
  At the same time, test_sort_parallel and test_sort_parallel_single_col has some problems with __pycache__:
  - They are passed if execute the suite with -B
  - They are passed if execute them separate
  - They and failed when some tests above are unskipped.
  So, decide to skip them.

* Remove spark dependency (#102)
  Remove spark dependency from HPA; use pre-generated sdf_dt.pq
* explicitly adding data-file (#104)
* HPAT Build: Code style check for C and Python sources (#103)
* HPAT Build: Code style check for C and Python sources
* PR103. Comments partially addressed
* Code style change part 1 (#106)
* Style check config fo pystyle (#105)
* Fix for pandas.merge wrong overload handling of 'on' args (#99)
  Problem description: merge_overload and merge_asof_overload functions use 'on' argument value to compute 'left_on' and 'right_on' arguments in a way that breaks type stability, causing compilation failure when 'on' is assigned a StringLiteral value.
  Error:
    File "../hpat/hiframes/dataframe_pass.py", line 202, in _run_assign
      return self._run_call(assign, lhs, rhs)
    File "../hpat/hiframes/dataframe_pass.py", line 522, in _run_call
      return self._run_call_join(assign, lhs, rhs)
    File "../hpat/hiframes/dataframe_pass.py", line 1488, in _run_call_join
      left_on = self._get_const_or_list(left_on_var)
    File "../hpat/hiframes/dataframe_pass.py", line 2135, in _get_const_or_list
      raise ValueError(err_msg)
  ValueError: Failed in hpat mode pipeline (step: typed dataframe pass)
  None
  Following tests should be fixed with this commit:
  test_join_cat1 (hpat.tests.test_join.TestJoin)
  test_join_cat2 (hpat.tests.test_join.TestJoin)
  test_join_cat_parallel1 (hpat.tests.test_join.TestJoin)
  test_join_datetime_seq1 (hpat.tests.test_join.TestJoin)
  test_join_left_seq1 (hpat.tests.test_join.TestJoin)
  test_join_left_seq2 (hpat.tests.test_join.TestJoin)
  test_join_outer_seq1 (hpat.tests.test_join.TestJoin)
  test_join_right_seq1 (hpat.tests.test_join.TestJoin)
  test_merge_asof_seq1 (hpat.tests.test_join.TestJoin)
* [STL] PEP8 code style for 'test_strings.py', 'test_utils.py', 'test_series.py' (#97)
* pep8 style for 'test_strings.py'; flake8 check successful
* pep8 style for 'test_utils.py'
* pep8 style for 'test_series.py'; more readable
* fixed 'test_string_series'
* removed extra white spaces
* deleted mention of flake8
* trigger build
* Code style change part 2 (#107)
* code_style_change_part_2
* Add more check in style configuration (#108)
* code_style_part_3 (#109)
* Fix boost runtime issue on Ubuntu16.04 with gcc 5.4 (#92)
* Code style change part 4 (#110)
* Cahnge tests execution
  Actually test suite should be executed via hpat.runtests:
  python -u -m hpat.runtests -v
  This resolve the issue with doulbe test suite execution which occurs due to the "python -u -m unittest -v" command import all files in tree including runtests.py and runtests.py triggers 1-st suite execution. Then unittest triggers 2-d.
  Add decorator to execute some tests (mostly parallel) 2 or more times (depending on existance of REPEAT_TEST_NUMBER environment variable)
  This should highlight issues like the test fails if executed twice because is corrupts memory during first execution (like test_series_head_index_parallel1)
  Skip test_series_head_index_parallel1 because it triggers memory corruption. This should be fixed.
* Revert "Code style change part 4 (#110)"
  This reverts commit dfc54ee.
* Revert "Fix boost runtime issue on Ubuntu16.04 with gcc 5.4 (#92)"
  This reverts commit 231a76c.
* Revert "code_style_part_3 (#109)"
  This reverts commit 4070ce3.
* Revert "Add more check in style configuration (#108)"
  This reverts commit abf5bd0.
* Revert "Code style change part 2 (#107)"
  This reverts commit 9076493.
* Revert "[STL] PEP8 code style for 'test_strings.py', 'test_utils.py', 'test_series.py' (#97)"
  This reverts commit 8641f7a.
* Revert "Fix for pandas.merge wrong overload handling of 'on' args (#99)"
  This reverts commit a2a8ee5.
* Revert "Style check config fo pystyle (#105)"
  This reverts commit 551c0e3.
* Revert "Code style change part 1 (#106)"
  This reverts commit 6dae0b3.
* Revert "HPAT Build: Code style check for C and Python sources (#103)"
  This reverts commit 1a30e4f.
* Revert "explicitly adding data-file (#104)"
  This reverts commit 34a2260.
* Revert "Remove spark dependency (#102)"
  This reverts commit 9e77fde.
* Wrap functions to be executed twice in runtests.py
* Update runtests.py
  Execute every test specified times, which is set via the REPEAT_TEST_NUMBER environment variable.
  Skip test_series_list_str_unbox1 because is fails on the second launch with Segmentation fault
* Apply comments from review
  Rename REPEAT_TEST_NUMBER to HPAT_REPEAT_TEST_NUMBER
  Use os.getenv to get value for HPAT_REPEAT_TEST_NUMBER

kozlov-alexey added 3 commits July 18, 2019 16:18

Merge pull request #1 from IntelPython/master

793d66e

[BUG] Fixed problems with generation parquet files (IntelPython#93)

Merge pull request #2 from IntelPython/master

20b9b6f

Fixing issue with named series handling in fillna (IntelPython#95)

Merge pull request #3 from IntelPython/master

6640831

Merge from public

shssf reviewed Jul 26, 2019

kozlov-alexey force-pushed the feature/fix_merge_handle_on_args branch from ca99c6c to 571f3f2 Compare August 1, 2019 14:29

Merge pull request #4 from IntelPython/master

cd74df0

Merging commits from public repo

kozlov-alexey force-pushed the feature/fix_merge_handle_on_args branch from 571f3f2 to 2d189c1 Compare August 1, 2019 14:31

fschlimb approved these changes Aug 1, 2019

shssf approved these changes Aug 1, 2019

kozlov-alexey force-pushed the feature/fix_merge_handle_on_args branch 2 times, most recently from abc7b7b to f3d5729 Compare August 1, 2019 16:54

kozlov-alexey force-pushed the feature/fix_merge_handle_on_args branch from f3d5729 to 1bba528 Compare August 1, 2019 18:06

fschlimb approved these changes Aug 1, 2019

shssf approved these changes Aug 1, 2019

shssf added the FullyApproved label (Mark PR if all required approval exists) Aug 1, 2019

kozlov-alexey force-pushed the feature/fix_merge_handle_on_args branch from 5c5f347 to 7288c62 Compare August 2, 2019 09:37

Merge branch 'master' into feature/fix_merge_handle_on_args

76d7cec

shssf merged commit 91e55d8 into IntelPython:master Aug 2, 2019

Vyacheslav-Smirnov added a commit to Vyacheslav-Smirnov/sdc that referenced this pull request Aug 7, 2019

Revert "Fix for pandas.merge wrong overload handling of 'on' args (In…

c7c5db1

…telPython#99)" This reverts commit a2a8ee5.

Vyacheslav-Smirnov added a commit to Vyacheslav-Smirnov/sdc that referenced this pull request Aug 7, 2019

Revert "Fix for pandas.merge wrong overload handling of 'on' args (In…

840dab2

…telPython#99)" This reverts commit a2a8ee5.

kozlov-alexey deleted the feature/fix_merge_handle_on_args branch October 4, 2019 14:47
