$ ls -lh test_*.svg
-rw-r----- 1 ajstewart staff  33K Dec 16 14:19 test_100.svg
-rw-r----- 1 ajstewart staff 128K Dec 16 14:19 test_1000.svg
-rw-r----- 1 ajstewart staff 1.0M Dec 16 14:19 test_10000.svg
-rw-r----- 1 ajstewart staff  10M Dec 16 14:19 test_100000.svg
-rw-r----- 1 ajstewart staff 102M Dec 16 14:20 test_1000000.svg
Supporting this would need an enhancement that subsamples the data when a flag is set. One issue with plotting percentiles is that the values shown may not be actual data points, since some percentile estimation methods interpolate.
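One way to sidestep the interpolation concern is to keep a subset of the sorted observations rather than estimating percentiles. The following is only a sketch of that idea, not statsmodels code: the function name `thin_order_statistics`, the `n_points` parameter, and the `(rank + 0.5) / n` plotting position are all assumptions made for illustration, and it compares against a standard normal using only the Python standard library.

```python
import random
from statistics import NormalDist


def thin_order_statistics(sample, n_points=100):
    """Keep about n_points observed values for a normal Q-Q plot.

    Hypothetical helper: every returned y-value is an actual data point,
    so nothing is interpolated. Assumes n_points >= 2.
    """
    ordered = sorted(sample)
    n = len(ordered)
    if n <= n_points:
        ranks = list(range(n))
    else:
        # Evenly spaced ranks that always include the minimum and maximum.
        ranks = [round(i * (n - 1) / (n_points - 1)) for i in range(n_points)]
    std_normal = NormalDist()
    # Assumed plotting position for the observation at each retained rank.
    theoretical = [std_normal.inv_cdf((r + 0.5) / n) for r in ranks]
    observed = [ordered[r] for r in ranks]
    return theoretical, observed


random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(10_000)]
theoretical, observed = thin_order_statistics(sample)
```

Because the theoretical quantile is computed from each retained observation's original rank, the thinned points lie exactly where they would in the full plot; only the points in between are dropped.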
Describe the bug
It appears that the file size of Q-Q plots scales as $\mathcal{O}(N)$ for data of size $N$.
Code Sample
Note: the same behavior can be reproduced with qqplot or qqplot_2samples.
Expected Output
I would expect each plot to contain 100 points (1 for each quantile), but it seems to contain every single point in the dataset. This results in files that are too large to even render, making the plot rather useless for large datasets. Is this the intended behavior? I think it would be more useful to plot 100 points, or at least have an option to do so.
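As a rough illustration of the behavior requested above, the sketch below reduces a sample of any size to a fixed number of quantile pairs before plotting. It is not an existing statsmodels option: `qq_points` and `n_points` are hypothetical names, the reference distribution is assumed to be standard normal, and the empirical quantiles from `statistics.quantiles` are interpolated, so they need not coincide with observed values.

```python
import random
from statistics import NormalDist, quantiles


def qq_points(sample, n_points=100):
    """Return (theoretical, empirical) quantiles for a normal Q-Q plot.

    Hypothetical helper: the output has n_points pairs regardless of
    how many observations the sample contains.
    """
    # quantiles() returns n - 1 cut points, so request n_points + 1 intervals.
    empirical = quantiles(sample, n=n_points + 1, method="inclusive")
    std_normal = NormalDist()
    probs = [(i + 1) / (n_points + 1) for i in range(n_points)]
    theoretical = [std_normal.inv_cdf(p) for p in probs]
    return theoretical, empirical


random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(100_000)]
theo, emp = qq_points(data)
```

Plotting `emp` against `theo` would give a figure whose size depends on `n_points` rather than on $N$, at the cost of hiding detail in the extreme tails.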
Versions
INSTALLED VERSIONS
Python: 3.10.8.final.0
OS: Darwin 21.6.0 Darwin Kernel Version 21.6.0: Thu Sep 29 20:13:56 PDT 2022; root:xnu-8020.240.7~1/RELEASE_ARM64_T6000 arm64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
statsmodels
Installed: 0.13.5 (/Users/ajstewart/spack/opt/spack/darwin-monterey-m1/apple-clang-14.0.0/py-statsmodels-0.13.5-adxlc7oy3ktl4kh5vovnox6p4xfo42mu/lib/python3.10/site-packages/statsmodels)
Required Dependencies
cython: 0.29.32 (/Users/ajstewart/spack/opt/spack/darwin-monterey-m1/apple-clang-14.0.0/py-cython-0.29.32-ivsv2b4ous7evv2ubjgvg55kfhmipfha/lib/python3.10/site-packages/Cython)
numpy: 1.23.4 (/Users/ajstewart/spack/opt/spack/darwin-monterey-m1/apple-clang-14.0.0/py-numpy-1.23.4-lrwjrgphd7votelcqhtiofnwjc66qeu7/lib/python3.10/site-packages/numpy)
scipy: 1.9.3 (/Users/ajstewart/spack/opt/spack/darwin-monterey-m1/apple-clang-14.0.0/py-scipy-1.9.3-awasw63634ccwei2d4yzqawpemumjce3/lib/python3.10/site-packages/scipy)
pandas: 1.5.1 (/Users/ajstewart/spack/opt/spack/darwin-monterey-m1/apple-clang-14.0.0/py-pandas-1.5.1-ig3wajvrqdbhneo6hrxjjtqblcxjbm7b/lib/python3.10/site-packages/pandas)
dateutil: 2.8.2 (/Users/ajstewart/spack/opt/spack/darwin-monterey-m1/apple-clang-14.0.0/py-python-dateutil-2.8.2-6jntibflip4pxdx4zcobobgq6uujtrth/lib/python3.10/site-packages/dateutil)
patsy: 0.5.2 (/Users/ajstewart/spack/opt/spack/darwin-monterey-m1/apple-clang-14.0.0/py-patsy-0.5.2-abnyxte3so75czuftdj4aqc2c34hri2f/lib/python3.10/site-packages/patsy)
Optional Dependencies
matplotlib: 3.6.2 (/Users/ajstewart/.spack/.spack-env/view/lib/python3.10/site-packages/matplotlib)
backend: MacOSX
cvxopt: Not installed
joblib: 1.2.0 (/Users/ajstewart/.spack/.spack-env/view/lib/python3.10/site-packages/joblib)
Developer Tools
IPython: 8.5.0 (/Users/ajstewart/.spack/.spack-env/view/lib/python3.10/site-packages/IPython)
jinja2: 3.1.2 (/Users/ajstewart/.spack/.spack-env/view/lib/python3.10/site-packages/jinja2)
sphinx: 5.3.0 (/Users/ajstewart/.spack/.spack-env/view/lib/python3.10/site-packages/sphinx)
pygments: 2.13.0 (/Users/ajstewart/.spack/.spack-env/view/lib/python3.10/site-packages/pygments)
pytest: 7.1.3 (/Users/ajstewart/.spack/.spack-env/view/lib/python3.10/site-packages/pytest)
virtualenv: Not installed