ENH: Add Series.str.casefold #25419

charlesdong1991 · 2019-02-23T19:41:19Z

closes Series.str.casefold #25405
tests added / passed
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

pandas/tests/test_strings.py

codecov · 2019-02-23T20:04:40Z

Codecov Report

Merging #25419 into master will decrease coverage by 50.03%.
The diff coverage is 100%.

@@             Coverage Diff             @@
##           master   #25419       +/-   ##
===========================================
- Coverage   91.73%    41.7%   -50.04%     
===========================================
  Files         173      173               
  Lines       52848    52850        +2     
===========================================
- Hits        48482    22040    -26442     
- Misses       4366    30810    +26444

Flag	Coverage Δ
#multiple	`?`
#single	`41.7% <100%> (ø)`	⬆️

Impacted Files	Coverage Δ
pandas/core/strings.py	`33.19% <100%> (-65.4%)`	⬇️
pandas/io/formats/latex.py	`0% <0%> (-100%)`	⬇️
pandas/core/categorical.py	`0% <0%> (-100%)`	⬇️
pandas/io/sas/sas_constants.py	`0% <0%> (-100%)`	⬇️
pandas/tseries/plotting.py	`0% <0%> (-100%)`	⬇️
pandas/tseries/converter.py	`0% <0%> (-100%)`	⬇️
pandas/io/formats/html.py	`0% <0%> (-99.35%)`	⬇️
pandas/core/groupby/categorical.py	`0% <0%> (-95.46%)`	⬇️
pandas/io/sas/sas7bdat.py	`0% <0%> (-91.17%)`	⬇️
pandas/io/sas/sas_xport.py	`0% <0%> (-90.15%)`	⬇️
... and 132 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 15d8178...6608c25. Read the comment docs.

codecov · 2019-02-23T20:04:41Z

Codecov Report

Merging #25419 into master will increase coverage by 0.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #25419      +/-   ##
==========================================
+ Coverage   91.74%   91.75%   +0.01%     
==========================================
  Files         173      173              
  Lines       52923    52955      +32     
==========================================
+ Hits        48554    48589      +35     
+ Misses       4369     4366       -3

Flag	Coverage Δ
#multiple	`90.32% <100%> (ø)`	⬆️
#single	`41.72% <100%> (-0.02%)`	⬇️

Impacted Files	Coverage Δ
pandas/core/strings.py	`98.59% <100%> (ø)`	⬆️
pandas/core/frame.py	`96.85% <0%> (ø)`	⬆️
pandas/core/indexes/category.py	`98.61% <0%> (ø)`	⬆️
pandas/core/groupby/generic.py	`87% <0%> (+0.01%)`	⬆️
pandas/io/json/json.py	`93.22% <0%> (+0.08%)`	⬆️
pandas/util/testing.py	`87.66% <0%> (+0.09%)`	⬆️
pandas/core/groupby/groupby.py	`97.2% <0%> (+0.4%)`	⬆️

Powered by Codecov. Last update 85572de...22717a1. Read the comment docs.

gfyoung

cc @jreback

jreback · 2019-02-23T20:41:18Z

needs a mention in text.rst (there is also a list of functions there) and in the references,
and a whatsnew note

jreback

see comments

charlesdong1991 · 2019-02-23T21:06:17Z

Thanks @jreback, I added it to text.rst as well as to reference/series.rst. I am not sure which whatsnew note you are referring to; I already added one in 0.25.0.rst. Please let me know if you want it moved somewhere else.

pep8speaks · 2019-02-23T21:17:11Z

Hello @charlesdong1991! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on February 28, 2019 at 19:59 Hours UTC

jreback · 2019-02-24T03:24:39Z

doc/source/whatsnew/v0.25.0.rst

@@ -22,8 +22,8 @@ Other Enhancements
 - Indexing of ``DataFrame`` and ``Series`` now accepts zerodim ``np.ndarray`` (:issue:`24919`)
 - :meth:`Timestamp.replace` now supports the ``fold`` argument to disambiguate DST transition times (:issue:`25017`)
 - :meth:`DataFrame.at_time` and :meth:`Series.at_time` now support :meth:`datetime.time` objects with timezones (:issue:`24043`)
+- Add ``casefold`` to ``Series.str`` (:issue:`25405`)


Series.str has gained the :meth:`Series.str.casefold` method to ......(fill this in)

jreback · 2019-02-24T03:25:37Z

pandas/core/strings.py

@@ -2995,6 +2996,7 @@ def rindex(self, sub, start=0, end=None):
    _shared_docs['capitalize'] = dict(type='be capitalized',
                                      method='capitalize')
    _shared_docs['swapcase'] = dict(type='be swapcased', method='swapcase')
+    _shared_docs['casefold'] = dict(type='be casefolded', method='casefold')


can you add the versionadded 0.25.0 here (may need to add it to the dict to be formatted)

thanks, @jreback added!

can u verify this renders ok in the terminal

ahh, yeah, I printed the docstring, looks fine for me. @jreback

    Convert strings in the Series/Index to be casefolded.

    Equivalent to :meth:`str.casefold`.

    Returns
    -------
    Series/Index of objects

    See Also
    Series.str.lower : Converts all characters to lowercase.
    Series.str.upper : Converts all characters to uppercase.
    Series.str.title : Converts first character of each word to uppercase and
        remaining to lowercase.
    Series.str.capitalize : Converts first character to uppercase and
        remaining to lowercase.
    Series.str.swapcase : Converts uppercase to lowercase and lowercase to
        uppercase
    Series.str.casefold: Removes all case distinctions in the string.

    .. versionadded:: 0.25.0

Does this render well on the web? Not aware of any other instance where the version added is in the See Also section. May be worth messing around with substitution to put it in the Summary

This renders fine for me in the terminal. The shared summary template only takes two parameters, and of these methods only Series.str.casefold is new in this version, so I put the note in See Also under Series.str.casefold. Otherwise, I assume the template would need to be changed.

You should be able to generate one HTML file rather easily:

https://python-sprints.github.io/pandas/guide/pandas_pr.html#visual-validation-of-the-docstring

If you can double check there, that would be preferable, as the majority of users will read the docs as HTML.

Sorry, do you have any idea why it's complaining AttributeError: type object 'StringMethods' has no attribute 'casefold'? Is it referring to master? @WillAyd

Btw, I tried something like the below to put the version added in the summary, but then I have to manually insert a blank line for the other methods, since they don't have a versionadded note. I would like to hear your advice on how to improve this. @WillAyd

    _shared_docs['casemethods'] = ("""
    Convert strings in the Series/Index to %(type)s.
    Equivalent to :meth:`str.%(method)s`.
    %(version)s
    ........
    _shared_docs['casefold'] = dict(type='be casefolded',
                                    method='casefold',
                                    version='.. versionadded:: 0.25.0')

What are you running to get the AttributeError? As for the second comment, it's probably easiest to add the newline(s) into the version argument.
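To illustrate the suggestion above, here is a minimal sketch of a shared docstring template with a `%(version)s` slot. The template text, the `shared_docs` dict, and the `render` helper are illustrative assumptions, not the actual contents of pandas/core/strings.py. The point is only that methods without a version note pass an empty string, while the new method carries its own newlines inside the argument.

```python
# Hypothetical template, loosely modelled on the snippet quoted above.
CASE_TEMPLATE = """\
Convert strings in the Series/Index to %(type)s.
%(version)s
Equivalent to :meth:`str.%(method)s`.
"""

shared_docs = {
    # Existing methods have no version note, so the slot stays empty
    # and the template line collapses to a single blank line.
    "lower": dict(type="lowercase", method="lower", version=""),
    # The new method puts the surrounding blank lines in the argument itself.
    "casefold": dict(
        type="be casefolded",
        method="casefold",
        version="\n.. versionadded:: 0.25.0\n",
    ),
}


def render(name):
    """Fill the template using old-style % formatting with a dict."""
    return CASE_TEMPLATE % shared_docs[name]


print(render("lower"))
print(render("casefold"))
```

With this layout the other methods keep exactly one blank line between the summary and the "Equivalent to" sentence, and the directive for the new method is separated by a blank line on each side.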

jreback

lgtm with some minor comments. ping on green

jreback · 2019-02-24T07:51:55Z

doc/source/user_guide/text.rst

@@ -618,3 +618,4 @@ Method Summary
    :meth:`~Series.str.istitle`;Equivalent to ``str.istitle``
    :meth:`~Series.str.isnumeric`;Equivalent to ``str.isnumeric``
    :meth:`~Series.str.isdecimal`;Equivalent to ``str.isdecimal``
+    :meth:`~Series.str.casefold`;Equivalent to ``str.casefold``


not sure if these are ordered here - if they are alphabetical then this is not in the right place; otherwise maybe put it next to lower

ok, the methods in series.rst are alphabetically ordered, but here they are not. I will move it up a bit!

jreback · 2019-02-24T07:52:51Z

pandas/core/strings.py

can u verify this renders ok in the terminal

WillAyd

Very minor nit but otherwise lgtm

WillAyd · 2019-02-27T17:02:53Z

pandas/tests/test_strings.py

+    @pytest.mark.skipif(compat.PY2, reason='not in python2')
+    def test_casefold(self):
+        # GH25405
+        casefolded = Series(['ss', NA, 'case', 'ssd'])


Just call this expected

thanks, @WillAyd, I just changed it!
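For context on the expected values in the test above, here is a small pure-Python sketch of how `str.casefold` differs from `str.lower`. The input list is hypothetical (the test's actual input is not shown in this thread), and `casefold_values` is an illustrative helper, not part of pandas; it only mimics the idea that missing values pass through unchanged.

```python
def casefold_values(values):
    """Casefold each string, passing missing values (None) through."""
    return [v.casefold() if isinstance(v, str) else v for v in values]


# Hypothetical inputs chosen so that casefold and lower give different results.
data = ["ß", None, "CASE", "ßd"]

# casefold expands the German sharp s to 'ss'.
print(casefold_values(data))

# lower leaves 'ß' untouched, so case-insensitive comparisons can still differ.
print([v.lower() if isinstance(v, str) else v for v in data])
```

`str.casefold` only exists on Python 3, which is why the test is skipped under Python 2.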

charlesdong1991 · 2019-02-28T14:05:54Z

I changed it a bit; now it looks like this. Do you prefer this? @WillAyd

    Convert strings in the Series/Index to casefolded.
    .. versionadded:: 0.25.0
    Equivalent to :meth:`str.casefold`.

    Returns
    -------
    Series/Index of objects

    See Also
    --------
    Series.str.lower : Converts all characters to lowercase.
    Series.str.upper : Converts all characters to uppercase.
    Series.str.title : Converts first character of each word to uppercase and
        remaining to lowercase.
    Series.str.capitalize : Converts first character to uppercase and
        remaining to lowercase.
    Series.str.swapcase : Converts uppercase to lowercase and lowercase to
        uppercase.
    Series.str.casefold: Removes all case distinctions in the string.

jreback · 2019-02-28T14:50:25Z

lgtm. over to you @WillAyd

WillAyd · 2019-02-28T16:37:00Z

@charlesdong1991 does that render correctly in HTML? Might need a blank line before versionadded

charlesdong1991 · 2019-02-28T18:56:14Z

Somehow, I still get this error when running python make.py html --single pandas.Series.str.casefold. Is it because str.casefold is not in Python 2? @WillAyd

Traceback (most recent call last):
  File "make.py", line 79, in _process_single_doc
    obj = getattr(obj, name)
AttributeError: type object 'StringMethods' has no attribute 'casefold'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "make.py", line 339, in <module>
    sys.exit(main())
  File "make.py", line 334, in main
    args.verbosity, args.warnings_are_errors)
  File "make.py", line 46, in __init__
    single_doc = self._process_single_doc(single_doc)
  File "make.py", line 81, in _process_single_doc
    raise ImportError('Could not import {}'.format(single_doc))
ImportError: Could not import pandas.Series.str.casefold

And I was thinking of adding a blank line in between; however, all the other methods would then have two blank lines, and I don't think that's good...

Convert strings in the Series/Index to lower.


Equivalent to :meth:`str.lower`.

WillAyd · 2019-02-28T19:45:12Z

Not sure the distinction you are trying to make with Python2 - are you actively developing with that?

Otherwise, do the other methods produce something for you? I would think it should work the same as the rest. As far as the blank lines go, you can just add a "\n" preceding the argument.

charlesdong1991 · 2019-02-28T19:57:16Z

I distinguish Python 2 because str.casefold is only valid in Python 3+, if I am not wrong.

It's quite weird that all the other methods work fine but this newly added one doesn't... and it is the only one that is valid only in Python 3, so I guessed that might be where it goes wrong. @WillAyd

And I added \n to make the new lines, thanks!

WillAyd · 2019-02-28T21:21:17Z

Looks like an issue in the make file which is causing it to pick up another pandas installation on your computer which obviously wouldn't have that attribute yet. I patched locally just to make sure so here's the doc:

lgtm
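One way to check for the situation described above (a different pandas installation being picked up) is to ask Python where a module would be imported from. This is a minimal sketch using only the standard library; `module_origin` is a hypothetical helper, not part of make.py.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# Run from the doc/ directory: a path outside the working tree suggests that
# a separately installed pandas is shadowing the development checkout.
print(module_origin("pandas"))
```

If the printed path points into site-packages rather than the repository checkout, the doc build is importing the installed release, which would not yet have the new method.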

WillAyd · 2019-02-28T21:21:48Z

Thanks @charlesdong1991 nice work!

* ERR/TST: Add pytest idiom to dtypes/test_cast.py (pandas-dev#24847) * fix MacPython pandas-wheels failue (pandas-dev#24851) * DEPS: Bump pyarrow min version to 0.9.0 (pandas-dev#24854) Closes pandas-devgh-24767 * DOC: Document AttributeError for accessor (pandas-dev#24855) Closes pandas-dev#20579 * Start whatsnew for 0.24.1 and 0.25.0 (pandas-dev#24848) * DEPR/API: Non-ns precision in Index constructors (pandas-dev#24806) * BUG: Format mismatch doesn't coerce to NaT (pandas-dev#24815) * BUG: Properly parse unicode usecols names in CSV (pandas-dev#24856) * CLN: fix typo in asv eval.Query suite (pandas-dev#24865) * BUG: DataFrame respects dtype with masked recarray (pandas-dev#24874) * REF/CLN: Move private method (pandas-dev#24875) * BUG : ValueError in case on NaN value in groupby columns (pandas-dev#24850) * BUG: fix floating precision formatting in presence of inf (pandas-dev#24863) * DOC: Creating top-level user guide section, and moving pages inside (pandas-dev#24677) * DOC: Creating top-level development section, and moving pages inside (pandas-dev#24691) * DOC: Creating top-level getting started section, and moving pages inside (pandas-dev#24678) * DOC: Implementing redirect system, and adding user_guide redirects (pandas-dev#24715) * DOC: Implementing redirect system, and adding user_guide redirects * Using relative urls for the redirect * Validating that no file is overwritten by a redirect * Adding redirects for getting started and development sections * DOC: fixups (pandas-dev#24888) * Fixed heading on whatnew * Remove empty scalars.rst * CLN: fix typo in ctors.SeriesDtypesConstructors setup (pandas-dev#24894) * DOC: No clean in sphinx_build (pandas-dev#24902) Closes pandas-dev#24727 * BUG (output formatting): use fixed with for truncation column instead of inferring from last column (pandas-dev#24905) * DOC: also redirect old whatsnew url (pandas-dev#24906) * Revert BUG-24212 fix usage of Index.take in pd.merge (pandas-dev#24904) * Revert BUG-24212 fix 
usage of Index.take in pd.merge xref pandas-dev#24733 xref pandas-dev#24897 * test 0.23.4 output * added note about buggy test * DOC: Add experimental note to DatetimeArray and TimedeltaArray (pandas-dev#24882) * DOC: Add experimental note to DatetimeArray and TimedeltaArray * Disable M8 in nanops (pandas-dev#24907) * Disable M8 in nanops Closes pandas-dev#24752 * CLN: fix typo in asv benchmark of non_unique_sorted, which was not sorted (pandas-dev#24917) * API/VIS: remove misc plotting methods from plot accessor (revert pandas-dev#23811) (pandas-dev#24912) * DOC: some 0.24.0 whatsnew clean-up (pandas-dev#24911) * DOC: Final reorganization of documentation pages (pandas-dev#24890) * DOC: Final reorganization of documentation pages * Move ecosystem to top level * DOC: Adding redirects to API moved pages (pandas-dev#24909) * DOC: Adding redirects to API moved pages * DOC: Making home page links more compact and clearer (pandas-dev#24928) * DOC: 0.24 release date (pandas-dev#24930) * DOC: Adding version to the whatsnew section in the home page (pandas-dev#24929) * API: Remove IntervalArray from top-level (pandas-dev#24926) * RLS: 0.24.0 * DEV: Start 0.25 cycle * DOC: State that we support scalars in to_numeric (pandas-dev#24944) We support it and test it already. xref pandas-devgh-24910. 
* DOC: Minor what's new fix (pandas-dev#24933) * TST: GH#23922 Add missing match params to pytest.raises (pandas-dev#24937) * Add tests for NaT when performing dt.to_period (pandas-dev#24921) * DOC: switch headline whatsnew to 0.25 (pandas-dev#24941) * BUG-24212 fix regression in pandas-dev#24897 (pandas-dev#24916) * CLN: reduce overhead in setup for categoricals benchmarks in asv (pandas-dev#24913) * Excel Reader Refactor - Base Class Introduction (pandas-dev#24829) * TST/REF: Add pytest idiom to test_numeric.py (pandas-dev#24946) * BLD: silence npy_no_deprecated warnings with numpy>=1.16.0 (pandas-dev#24864) * CLN: Refactor cython to use memory views (pandas-dev#24932) * DOC: Clean sort_values and sort_index docstrings (pandas-dev#24843) * STY: use pytest.raises context syntax (indexing) (pandas-dev#24960) * Fixed itertuples usage in to_dict (pandas-dev#24965) * Fixed itertuples usage in to_dict Closes pandas-dev#24940 Closes pandas-dev#24939 * STY: use pytest.raises context manager (resample) (pandas-dev#24977) * DOC: Document breaking change to read_csv (pandas-dev#24989) * DEPR: Fixed warning for implicit registration (pandas-dev#24964) * STY: use pytest.raises context manager (indexes/datetimes) (pandas-dev#24995) * DOC: move whatsnew note of pandas-dev#24916 (pandas-dev#24999) * BUG: Fix broken links (pandas-dev#25002) The previous location of contributing.rst file was /doc/source/contributing.rst but has been moved to /doc/source/development/contributing.rst * fix for BUG: grouping with tz-aware: Values falls after last bin (pandas-dev#24973) * REGR: Preserve order by default in Index.difference (pandas-dev#24967) Closes pandas-dev#24959 * CLN: do not use .repeat asv setting for storing benchmark data (pandas-dev#25015) * CLN: isort asv_bench/benchmark/algorithms.py (pandas-dev#24958) * fix+test to_timedelta('NaT', box=False) (pandas-dev#24961) * PERF: significant speedup in sparse init and ops by using numpy in check_integrity (pandas-dev#24985) * BUG: 
Fixed merging on tz-aware (pandas-dev#25033) * Test nested PandasArray (pandas-dev#24993) * DOC: fix error in documentation pandas-dev#24981 (pandas-dev#25038) * BUG: support dtypes in column_dtypes for to_records() (pandas-dev#24895) * Makes example from docstring work (pandas-dev#25035) * CLN: typo fixups (pandas-dev#25028) * BUG: to_datetime(strs, utc=True) used previous UTC offset (pandas-dev#25020) * BUG: Better handle larger numbers in to_numeric (pandas-dev#24956) * BUG: Better handle larger numbers in to_numeric * Warn about lossiness when passing really large numbers that exceed (u)int64 ranges. * Coerce negative numbers to float when requested instead of crashing and returning object. * Consistently parse numbers as integers / floats, even if we know that the resulting container has to be float. This is to ensure consistent error behavior when inputs numbers are too large. Closes pandas-devgh-24910. * MAINT: Address comments * BUG: avoid usage in_qtconsole for recent IPython versions (pandas-dev#25039) * Drop IPython<4.0 compat * Revert "Drop IPython<4.0 compat" This reverts commit 0cb0452. 
* update a * whatsnew * REGR: fix read_sql delegation for queries on MySQL/pymysql (pandas-dev#25024) * DOC: Start 0.24.2.rst (pandas-dev#25026) [ci skip] * REGR: rename_axis with None should remove axis name (pandas-dev#25069) * clarified the documentation for DF.drop_duplicates (pandas-dev#25056) * Clarification in docstring of Series.value_counts (pandas-dev#25062) * ENH: Support fold argument in Timestamp.replace (pandas-dev#25046) * CLN: to_pickle internals (pandas-dev#25044) * Implement+Test Tick.__rtruediv__ (pandas-dev#24832) * API: change Index set ops sort=True -> sort=None (pandas-dev#25063) * BUG: to_clipboard text truncated for Python 3 on Windows for UTF-16 text (pandas-dev#25040) * PERF: use new to_records() argument in to_stata() (pandas-dev#25045) * DOC: Cleanup 0.24.1 whatsnew (pandas-dev#25084) * Fix quotes position in pandas.core, typos and misspelled parameters. (pandas-dev#25093) * CLN: Remove sentinel_factory() in favor of object() (pandas-dev#25074) * TST: remove DST transition scenarios from tc pandas-dev#24689 (pandas-dev#24736) * BLD: remove spellcheck from Makefile (pandas-dev#25111) * DOC: small clean-up of 0.24.1 whatsnew (pandas-dev#25096) * DOC: small doc fix to Series.repeat (pandas-dev#25115) * TST: tests for categorical apply (pandas-dev#25095) * CLN: use dtype in constructor (pandas-dev#25098) * DOC: frame.py doctest fixing (pandas-dev#25097) * DOC: 0.24.1 release (pandas-dev#25125) [ci skip] * Revert set_index inspection/error handling for 0.24.1 (pandas-dev#25085) * DOC: Minor what's new fix (pandas-dev#24933) * Backport PR pandas-dev#24916: BUG-24212 fix regression in pandas-dev#24897 (pandas-dev#24951) * Revert "Backport PR pandas-dev#24916: BUG-24212 fix regression in pandas-dev#24897 (pandas-dev#24951)" This reverts commit 84056c5. 
* DOC/CLN: Timezone section in timeseries.rst (pandas-dev#24825) * DOC: Improve timezone documentation in timeseries.rst * edit some of the examples * Address review * DOC: Fix validation type error RT04 (pandas-dev#25107) (pandas-dev#25129) * Reading a HDF5 created in py2 (pandas-dev#25058) * BUG: Fixing regression in DataFrame.all and DataFrame.any with bool_only=True (pandas-dev#25102) * Removal of return variable names (pandas-dev#25123) * DOC: Improve docstring of Series.mul (pandas-dev#25136) * TST/REF: collect DataFrame reduction tests (pandas-dev#24914) * Fix validation error type `SS05` and check in CI (pandas-dev#25133) * Fixed tuple to List Conversion in Dataframe class (pandas-dev#25089) * STY: use pytest.raises context manager (indexes/multi) (pandas-dev#25175) * DOC: Updates to Timestamp document (pandas-dev#25163) * BLD: pin cython language level to '2' (pandas-dev#25145) Not explicitly pinning the language level has been producing future warnings from cython. The next release of cython is going to change the default level to '3str' under which the pandas cython extensions do not compile. The long term solution is to update the cython files to the next language level, but this is a stop-gap to keep pandas building. 
* CLN: Use ABCs in set_index (pandas-dev#25128) * DOC: update docstring for series.nunique (pandas-dev#25116) * DEPR: remove PanelGroupBy, disable DataFrame.to_panel (pandas-dev#25047) * BUG: DataFrame.merge(suffixes=) does not respect None (pandas-dev#24819) * fix MacPython pandas-wheels failure (pandas-dev#25186) * modernize compat imports (pandas-dev#25192) * TST: follow-up to Test nested pandas array pandas-dev#24993 (pandas-dev#25155) * revert changes to tests in pandas-devgh-24993 * Test nested PandasArray * isort test_numpy.py * change NP_VERSION_INFO * use LooseVersion * add _np_version_under1p16 * remove blank line from merge master * add doctstrings to fixtures * DOC/CLN: Fix errors in Series docstrings (pandas-dev#24945) * REF: Add more pytest idiom to test_holiday.py (pandas-dev#25204) * DOC: Fix validation type error SA05 (pandas-dev#25208) Create check for SA05 errors in CI * BUG: Fix Series.is_unique with single occurrence of NaN (pandas-dev#25182) * REF: Remove many Panel tests (pandas-dev#25191) * DOC: Fixes to docstrings and add PR10 (space before colon) to validation (pandas-dev#25109) * DOC: exclude autogenerated c/cpp/html files from 'trailing whitespace' checks (pandas-dev#24549) * STY: use pytest.raises context manager (indexes/period) (pandas-dev#25199) * fix ci failures (pandas-dev#25225) * DEPR: remove tm.makePanel and all usages (pandas-dev#25231) * DEPR: Remove Panel-specific parts of io.pytables (pandas-dev#25233) * DEPR: Add Deprecated warning for timedelta with passed units M and Y (pandas-dev#23264) * BUG-25061 fix printing indices with NaNs (pandas-dev#25202) * BUG: Fix regression in DataFrame.apply causing RecursionError (pandas-dev#25230) * BUG: Fix regression in DataFrame.apply causing RecursionError * Add feedback from PR * Add feedback after further code review * Add feedback after further code review 2 * BUG: Fix read_json orient='table' without index (pandas-dev#25170) (pandas-dev#25171) * BLD: prevent asv from calling 
sys.stdin.close() by using different launch method (pandas-dev#25237) * (Closes pandas-dev#25029) Removed extra bracket from cheatsheet code example. (pandas-dev#25032) * CLN: For loops, boolean conditions, misc. (pandas-dev#25206) * Refactor groupby group_add from tempita to fused types (pandas-dev#24954) * CLN: Remove ipython 2.x compat (pandas-dev#25150) * CLN: Remove ipython 2.x compat * trivial change to trigger asv * Update v0.25.0.rst * revert whatsnew * BUG: Duplicated returns boolean dataframe (pandas-dev#25234) * REF/TST: resample/test_base.py (pandas-dev#25262) * Revert "BLD: prevent asv from calling sys.stdin.close() by using different launch method (pandas-dev#25237)" (pandas-dev#25253) This reverts commit f67b7fd. * BUG: pandas Timestamp tz_localize and tz_convert do not preserve `freq` attribute (pandas-dev#25247) * DEPR: remove assert_panel_equal (pandas-dev#25238) * PR04 errors fix (pandas-dev#25157) * Split Excel IO Into Sub-Directory (pandas-dev#25153) * API: Ensure DatetimeTZDtype standardizes pytz timezones (pandas-dev#25254) * API: Ensure DatetimeTZDtype standardizes pytz timezones * Add whatsnew * BUG: Fix exceptions when Series.interpolate's `order` parameter is missing or invalid (pandas-dev#25246) * BUG: raise accurate exception from Series.interpolate (pandas-dev#24014) * Actually validate `order` before use in spline * Remove unnecessary check and dead code * Clean up comparison/tests based on feedback * Include invalid order value in exception * Check for NaN order in spline validation * Add whatsnew entry for bug fix * CLN: Make unit tests assert one error at a time * CLN: break test into distinct test case * PEP8 fix in test module * CLN: Test fixture for interpolate methods * BUG: DataFrame.join on tz-aware DatetimeIndex (pandas-dev#25260) * REF: use _constructor and ABCFoo to avoid runtime imports (pandas-dev#25272) * Refactor groupby group_prod, group_var, group_mean, group_ohlc (pandas-dev#25249) * Fix typo in Cheat sheet with 
regex (pandas-dev#25215) * Edit parameter type in pandas.core.frame.py DataFrame.count (pandas-dev#25198) * TST/CLN: remove test_slice_ints_with_floats_raises (pandas-dev#25277) * Removed Panel class from HDF ASVs (pandas-dev#25281) * DOC: Fix minor typo in docstring (pandas-dev#25285) * DOC/CLN: Fix errors in DataFrame docstrings (pandas-dev#24952) * Skipped broken Py2 / Windows test (pandas-dev#25323) * Rt05 documentation error fix issue 25108 (pandas-dev#25309) * Fix typos in docs (pandas-dev#25305) * Doc: corrects spelling in generic.py (pandas-dev#25333) * BUG: groupby.transform retains timezone information (pandas-dev#25264) * Fixes Formatting Exception (pandas-dev#25088) * Bug: OverflowError in resample.agg with tz data (pandas-dev#25297) * DOC/CLN: Fix various docstring errors (pandas-dev#25295) * COMPAT: alias .to_numpy() for timestamp and timedelta scalars (pandas-dev#25142) * ENH: Support times with timezones in at_time (pandas-dev#25280) * BUG: Fix passing of numeric_only argument for categorical reduce (pandas-dev#25304) * TST: use a fixed seed to have the same uniques across python versions (pandas-dev#25346) TST: add pytest-mock to handle mocker fixture * TST: xfail excel styler tests, xref GH25351 (pandas-dev#25352) * TST: xfail excel styler tests, xref GH25351 * CI: cleanup .c files for cpplint>1.4 * DOC: Correct doc mistake in combiner func (pandas-dev#25360) Closes pandas-devgh-25359. 
* DOC/BLD: fix --no-api option (pandas-dev#25209) * DOC: modify typos in Contributing section (pandas-dev#25365) * Remove spurious MultiIndex creation in `_set_axis_name` (pandas-dev#25371) * Resovles pandas-dev#25370 * Introduced by pandas-dev#22969 * pandas-dev#23049: test for Fatal Stack Overflow stemming From Misuse of astype('category') (pandas-dev#25366) * 9236: test for the DataFrame.groupby with MultiIndex having pd.NaT (pandas-dev#25310) * [BUG] exception handling of MultiIndex.__contains__ too narrow (pandas-dev#25268) * 14873: test for groupby.agg coercing booleans (pandas-dev#25327) * BUG/ENH: Timestamp.strptime (pandas-dev#25124) * BUG: constructor Timestamp.strptime() does not support %z. * Add doc string to NaT and Timestamp * updated the error message * Updated whatsnew entry. * Interval dtype fix (pandas-dev#25338) * [CLN] Excel Module Cleanups (pandas-dev#25275) Closes pandas-devgh-25153 Authored-By: tdamsma <tdamsma@gmail.com> * ENH: indexing and __getitem__ of dataframe and series accept zerodim integer np.array as int (pandas-dev#24924) * REGR: fix TimedeltaIndex sum and datetime subtraction with NaT (pandas-dev#25282, pandas-dev#25317) (pandas-dev#25329) * edited whatsnew typo (pandas-dev#25381) * fix typo of see also in DataFrame stat funcs (pandas-dev#25388) * API: more consistent error message for MultiIndex.from_arrays (pandas-dev#25189) * CLN: (re-)enable infer_dtype to catch complex (pandas-dev#25382) * DOC: Edited docstring of Interval (pandas-dev#25410) The docstring contained a repeated segment, which I removed. 
* Mark test_pct_max_many_rows as high memory (pandas-dev#25400) Fixes issue pandas-dev#25384 * Correct a typo of version number for interpolate() (pandas-dev#25418) * DEP: add pytest-mock to environment.yml (pandas-dev#25417) * BUG: Fix type coercion in read_json orient='table' (pandas-dev#21345) (pandas-dev#25219) * ERR: doc update for ParsingError (pandas-dev#25414) Closes pandas-devgh-22881 * ENH: Add in sort keyword to DatetimeIndex.union (pandas-dev#25110) * DOC: Rewriting of ParserError doc + minor spacing (pandas-dev#25421) Follow-up to pandas-devgh-25414. * API/ERR: allow iterators in df.set_index & improve errors (pandas-dev#24984) * BUG: Indexing with UTC offset string no longer ignored (pandas-dev#25263) * PERF/REF: improve performance of Series.searchsorted, PandasArray.searchsorted, collect functionality (pandas-dev#22034) * TST: remove never-used singleton fixtures (pandas-dev#24885) * BUG: fixed merging with empty frame containing an Int64 column (pandas-dev#25183) (pandas-dev#25289) * DOC: fixed geo accessor example in extending.rst (pandas-dev#25420) I realised "lon" and "lat" had just been switched with "longitude" and "latitude" in the following code block. So I used those names here as well. 
* TST: numpy RuntimeWarning with Series.round() (pandas-dev#25432) * CI: add __init__.py to isort skip list (pandas-dev#25455) * DOC: CategoricalIndex doc string (pandas-dev#24852) * DataFrame.drop Raises KeyError definition (pandas-dev#25474) * BUG: Keep column level name in resample nunique (pandas-dev#25469) Closes pandas-devgh-23222 xref pandas-devgh-23645 * ERR: Correct error message in to_datetime (pandas-dev#25467) * ERR: Correct error message in to_datetime Closes pandas-devgh-23830 xref pandas-devgh-23969 * Fix minor typo (pandas-dev#25458) Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com> * CI: Set pytest minversion to 4.0.2 (pandas-dev#25402) * CI: Set pytest minversion to 4.0.2 * STY: use pytest.raises context manager (indexes) (pandas-dev#25447) * STY: use pytest.raises context manager (tests/test_*) (pandas-dev#25452) * STY: use pytest.raises context manager (tests/test_*) * fix ci failures * skip py2 ci failure * Fix minor error in dynamic load function (pandas-dev#25256) * Cythonized GroupBy Quantile (pandas-dev#20405) * BUG: Fix regression on DataFrame.replace for regex (pandas-dev#25266) * BUG: Fix regression on DataFrame.replace for regex The commit ensures that the replacement for regex is not confined to the beginning of the string but spans all the characters within. The behaviour is then consistent with versions prior to 0.24.0. One test has been added to account for character replacement when the character is not at the beginning of the string. 
* Correct contribution guide docbuild instruction (pandas-dev#25479) * TST/REF: Add pytest idiom to test_frequencies.py (pandas-dev#25430) * BUG: Fix index type casting in read_json with orient='table' and float index (pandas-dev#25433) (pandas-dev#25434) * BUG: Groupby.agg with reduction function with tz aware data (pandas-dev#25308) * BUG: Groupby.agg cannot reduce with tz aware data * Handle output always as UTC * Add whatsnew * isort and add another fixed groupby.first/last issue * bring condition at a higher level * Add try for _try_cast * Add comments * Don't pass the utc_dtype explicitly * Remove unused import * Use string dtype instead * DOC: Fix docstring for read_sql_table (pandas-dev#25465) * ENH: Add Series.str.casefold (pandas-dev#25419) * Fix PR10 error and Clean up docstrings from functions related to RT05 errors (pandas-dev#25132) * Fix unreliable test (pandas-dev#25496) * DOC: Clarifying doc/make.py --single parameter (pandas-dev#25482) * fix MacPython / pandas-wheels ci failures (pandas-dev#25505) * DOC: Reword Series.interpolate docstring for clarity (pandas-dev#25491) * Changed insertion order to sys.path (pandas-dev#25486) * TST: xfail non-writeable pytables tests with numpy 1.16x (pandas-dev#25517) * STY: use pytest.raises context manager (arithmetic, arrays, computati… (pandas-dev#25504) * BUG: Fix RecursionError during IntervalTree construction (pandas-dev#25498) * STY: use pytest.raises context manager (plotting, reductions, scalar...) (pandas-dev#25483) * STY: use pytest.raises context manager (plotting, reductions, scalar...) * revert removed testing in test_timedelta.py * remove TODO from test_frame.py * skip py2 ci failure * BUG: Fix potential segfault after pd.Categorical(pd.Series(...), categories=...) (pandas-dev#25368) * Make DataFrame.to_html output full content (pandas-dev#24841) * BUG-16807-1 SparseFrame fills with default_fill_value if data is None (pandas-dev#24842) Closes pandas-devgh-16807. 
* DOC: Add conda uninstall pandas to contributing guide (pandas-dev#25490)
* fix pandas-dev#25487 add modify documentation
* fix segfault when running with cython coverage enabled, xref cython#2879 (pandas-dev#25529)
* TST: inline empty_frame = DataFrame({}) fixture (pandas-dev#24886)
* DOC: Polishing typos out of doc/source/user_guide/indexing.rst (pandas-dev#25528)
* STY: use pytest.raises context manager (frame) (pandas-dev#25516)
* DOC: Fix pandas-dev#24268 by updating description for keep in Series.nlargest (pandas-dev#25358)
* DOC: Fix pandas-dev#24268 by updating description for keep
* fix MacPython / pandas-wheels ci failures (pandas-dev#25537)
* TST/CLN: Remove more Panel tests (pandas-dev#25550)
* BUG: caught typeError in series.at (pandas-dev#25506) (pandas-dev#25533)
* ENH: Add errors parameter to DataFrame.rename (pandas-dev#25535)
* ENH: GH13473 Add errors parameter to DataFrame.rename
* TST: Skip IntervalTree construction overflow test on 32bit (pandas-dev#25558)
* DOC: Small fixes to 0.24.2 whatsnew (pandas-dev#25559)
* minor typo error (pandas-dev#25574)
* BUG: in error message raised when invalid axis parameter (pandas-dev#25553)
* BLD: Fixed pip install with no numpy (pandas-dev#25568)
* Document the behavior of `axis=None` with `style.background_gradient` (pandas-dev#25551)
* fix minor typos in dsintro.rst (pandas-dev#25579)
* BUG: Handle readonly arrays in period_array (pandas-dev#25556)
* BUG: Handle readonly arrays in period_array Closes pandas-dev#25403
* DOC: Fix typo in tz_localize (pandas-dev#25598)
* BUG: secondary y axis could not be set to log scale (pandas-dev#25545) (pandas-dev#25586)
* TST: add test for groupby on list of empty list (pandas-dev#25589)
* TYPING: Small fixes to make stubgen happy (pandas-dev#25576)
* CLN: Parmeterize test cases (pandas-dev#25355)

collect updated master

9b8fed6

charlesdong1991 force-pushed the issue_25405 branch from fb8c88e to 9b8fed6 on February 23, 2019 19:43

charlesdong1991 added 2 commits February 23, 2019 20:46

add whatsnew

c0d067d

small change

bfb3fa8

gfyoung added the Enhancement and Strings (String extension data type and string data) labels on Feb 23, 2019

gfyoung reviewed Feb 23, 2019

pandas/tests/test_strings.py (outdated, resolved)

gfyoung reviewed Feb 23, 2019

pandas/tests/test_strings.py (outdated, resolved)

remove unnecessary test

6608c25

rename

0d9ebec

gfyoung approved these changes Feb 23, 2019

jreback requested changes Feb 23, 2019

charlesdong1991 added 2 commits February 23, 2019 21:50

add series.str.casefold in reference

13b2442

add reference in text rst

3448d76

charlesdong1991 added 2 commits February 23, 2019 22:16

add skipif to avoid failure

d147075

add issue number

983332e

fix pep8

f9e52cc

jreback requested changes Feb 24, 2019

charlesdong1991 added 2 commits February 24, 2019 08:47

changes based on review

a1a8891

fix conflict

eb119d8

charlesdong1991 force-pushed the issue_25405 branch from acd0a61 to eb119d8 on February 24, 2019 07:49

jreback requested changes Feb 24, 2019

charlesdong1991 added 2 commits February 24, 2019 09:02

move position up

893d426

minor

522c021

WillAyd requested changes Feb 27, 2019

charlesdong1991 added 2 commits February 27, 2019 20:20

minor change on naming convention

bf35935

new args for version

bf49467

jreback added this to the 0.25.0 milestone Feb 28, 2019

jreback approved these changes Feb 28, 2019

add \n

22717a1

WillAyd approved these changes Feb 28, 2019

WillAyd merged commit db978c7 into pandas-dev:master Feb 28, 2019

WillAyd mentioned this pull request Feb 28, 2019

DOC: Use correct pandas when building documentation #25486

Merged
