Duplicate values in separate groupings bug #76

Closed
EythorE opened this issue Jun 1, 2023 · 2 comments
Labels
Next Release To work on for new version release

Comments


EythorE commented Jun 1, 2023

There seems to be a bug where, if you have duplicate values in separate groupings, the plot does not show some of the rows.

import sys
import forestplot as fp
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib

print(
    f"python version: {sys.version}",  # sys.version reports the Python interpreter, not numpy
    f"pandas version: {pd.__version__}",
    f"matplotlib version: {matplotlib.__version__}",
    f"forestplot version: {fp.__version__}",
    sep='\n'
)
# python version: 3.8.1 (default, Feb  3 2020, 12:44:18) 
# [GCC 4.8.5 20150623 (Red Hat 4.8.5-39)]
# pandas version: 1.4.3
# matplotlib version: 3.4.2
# forestplot version: 0.3.1

def create_data():
    group_a = pd.DataFrame({'name': ['name_a', 'name_b'], 'estimate': [1.1, 1.0]})
    group_a['Lower CI'] = group_a['estimate'] - 0.05
    group_a['Upper CI'] = group_a['estimate'] + 0.05
    group_a['group'] = "group_a"

    group_b = group_a.copy()
    group_b['group'] = 'group_b'
    groups = pd.concat([group_a, group_b], axis=0) 
    # group_a["group"] = "group_a"
    return groups

df = create_data()
display(df)

print("Missing part of the plot")
fp.forestplot(df,
              estimate='estimate', varlabel='name', ll="Lower CI", hl="Upper CI", groupvar="group")
plt.show()

# print("Still missing part of the plot")
# df.loc[df['group'] == 'group_b', ['estimate', "Lower CI", "Upper CI"]] += 0.001
# fp.forestplot(df,
#               estimate='estimate', varlabel='name', ll="Lower CI", hl="Upper CI", groupvar="group")
# plt.show()

print("Now it works")
df.loc[df['group'] == 'group_b', ['estimate', "Lower CI", "Upper CI"]] += 0.01
fp.forestplot(df,
              estimate='estimate', varlabel='name', ll="Lower CI", hl="Upper CI", groupvar="group")
plt.show()

[Image: forest_bug]

LSYS (Owner) commented Jun 11, 2023

@EythorE thanks so much for raising this bug! I'll have to find some time to investigate this. If you can work this out, do consider opening a pull request.

LSYS added the Type: Bug and Help Wanted labels Jun 11, 2023

LSYS (Owner) commented Dec 24, 2023

Related to #81. For future reference, duplicated labels are really groups. See the mplot feature (#59, WIP).
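
Until duplicated labels are handled as groups, one possible workaround is to make the labels unique before plotting. The following is only a sketch under that assumption: `make_labels_unique` is a hypothetical helper, not part of forestplot, and the column names are taken from the reproduction above.

```python
import pandas as pd


def make_labels_unique(df: pd.DataFrame, varlabel: str, groupvar: str) -> pd.DataFrame:
    """Hypothetical helper: append the group name to every label that is repeated."""
    # reset_index returns a new frame and drops the duplicated index left by pd.concat
    out = df.reset_index(drop=True)
    # keep=False marks every occurrence of a repeated label, not only the later ones
    dup = out[varlabel].duplicated(keep=False)
    out.loc[dup, varlabel] = (
        out.loc[dup, varlabel] + " (" + out.loc[dup, groupvar].astype(str) + ")"
    )
    return out


# Same shape as the data in the report: identical labels in two groups
df = pd.DataFrame({
    "name": ["name_a", "name_b", "name_a", "name_b"],
    "estimate": [1.1, 1.0, 1.1, 1.0],
    "group": ["group_a", "group_a", "group_b", "group_b"],
})
unique_df = make_labels_unique(df, varlabel="name", groupvar="group")
```

The resulting frame can then be passed to `fp.forestplot` with the same arguments as in the report. Whether the longer labels are acceptable in the final figure is a presentation choice.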

LSYS added the Next Release label and removed the Type: Bug and Help Wanted labels Dec 24, 2023
LSYS closed this as completed in a2ec12d Dec 24, 2023
LSYS added a commit that referenced this issue Dec 24, 2023
* From #73 by @juancq.

* Warn about duplicated `varlabel` (closes #76, closes #81).

* Add test that above warning works.

* Add known issues about duplicated `varlabel` (closes #76, closes #81) and PyCharm (closes #80).
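
For readers who want a similar guard in their own code, a duplicated-label warning might look roughly like the sketch below. This is not the code from the commit: `check_duplicated_varlabel` and the warning text are hypothetical, and only pandas and the standard `warnings` module are assumed.

```python
import warnings

import pandas as pd


def check_duplicated_varlabel(df: pd.DataFrame, varlabel: str) -> bool:
    """Hypothetical check: warn when `varlabel` has repeated values.

    Returns True if duplicates were found, False otherwise.
    """
    has_dup = bool(df[varlabel].duplicated().any())
    if has_dup:
        warnings.warn(
            f"Duplicated values found in column '{varlabel}'; "
            "some rows may not be plotted.",
            UserWarning,
            stacklevel=2,
        )
    return has_dup


df = pd.DataFrame({
    "name": ["name_a", "name_b", "name_a", "name_b"],
    "group": ["group_a", "group_a", "group_b", "group_b"],
})

# Capture the warning so it can be inspected, as a test would
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    found = check_duplicated_varlabel(df, "name")
```

Calling the check before plotting surfaces the problem early instead of leaving it to be spotted in the figure.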