BUG: pd.read_html() is broken with beautifulsoup4 4.14.0 #62492

@Dr-Irv

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd
DF = pd.DataFrame({"a": [1, 2, 3], "b": [0.0, 0.0, 0.0]})
DF.to_html("foo.html")
pd.read_html("foo.html", flavor=["bs4"])

Issue Description

The above code works fine with beautifulsoup4 version 4.13.5.
It breaks with beautifulsoup4 version 4.14.0, which was released on September 27, 2025.

So we either need to pin beautifulsoup4 or fix the bug.
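As a rough illustration of the pinning option, the sketch below checks whether the installed beautifulsoup4 falls in the range this report points at. The helper names (parse_release, bs4_is_affected, installed_bs4_version) are hypothetical and not part of pandas or Beautiful Soup. The report only establishes that 4.13.5 works and 4.14.0 breaks, so treating every release from 4.14.0 onward as affected is an assumption.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumption: only 4.13.5 (works) and 4.14.0 (breaks) are confirmed by the
# report; treating all releases >= 4.14.0 as affected is a guess.
FIRST_BROKEN = (4, 14, 0)


def parse_release(version_string):
    """Turn '4.14.0' into (4, 14, 0), ignoring any non-numeric suffix."""
    parts = []
    for piece in version_string.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    # Pad short versions such as '4.14' to three components.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def bs4_is_affected(version_string):
    """Return True if the version is at or above the first reported broken release."""
    return parse_release(version_string) >= FIRST_BROKEN


def installed_bs4_version():
    """Return the installed beautifulsoup4 version string, or None if absent."""
    try:
        return version("beautifulsoup4")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_bs4_version()
    if found is None:
        print("beautifulsoup4 is not installed")
    elif bs4_is_affected(found):
        print(f"beautifulsoup4 {found}: may hit this bug; consider pinning below 4.14.0")
    else:
        print(f"beautifulsoup4 {found}: predates the reported breakage")
```

A check like this could gate a test skip or a warning while the pin is in place; it does not fix the underlying incompatibility.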

Expected Behavior

pd.read_html() should read the table back without error under pandas 2.3.2, as it does with beautifulsoup4 4.13.5.

Installed Versions

INSTALLED VERSIONS

commit : 4665c10
python : 3.11.13
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.26100
machine : AMD64
processor : Intel64 Family 6 Model 183 Stepping 1, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : English_United States.1252

pandas : 2.3.2
numpy : 2.3.3
pytz : 2025.2
dateutil : 2.9.0.post0
pip : 25.1
Cython : None
sphinx : None
IPython : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : 4.14.0
blosc : None
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : None
html5lib : 1.1
hypothesis : None
gcsfs : None
jinja2 : 3.1.6
lxml.etree : 6.0.2
matplotlib : 3.10.6
numba : None
numexpr : 2.13.0
odfpy : None
openpyxl : 3.1.5
pandas_gbq : None
psycopg2 : None
pymysql : None
pyarrow : 21.0.0
pyreadstat : 1.3.1
pytest : 8.4.2
python-calamine : None
pyxlsb : 1.0.10
s3fs : None
scipy : 1.16.2
sqlalchemy : 2.0.43
tables : 3.10.2
tabulate : 0.9.0
xarray : 2025.9.0
xlrd : 2.0.2
xlsxwriter : 3.2.9
zstandard : 0.24.0
tzdata : 2025.2
qtpy : None
pyqt5 : None

Labels

  • Bug
  • IO HTML (read_html, to_html, Styler.apply, Styler.applymap)
  • Needs Tests (unit test(s) needed to prevent regressions)
