EDA plot function #5
Comments
shub970 added a commit that referenced this issue on May 28, 2019
Added the function plot(df, x, y) to plot bi-variate graphs. Close #5
shub970 added a commit that referenced this issue on Jun 3, 2019
code refactoring #5 Signed-off-by: shub970 <laddha.shubham97@gmail.com>
dovahcrow pushed a commit that referenced this issue on Jun 11, 2019
Added the function plot(df, x, y) to plot bi-variate graphs. Close #5
peiwangdb pushed a commit that referenced this issue on Mar 6, 2020
committer pei wang <pennyiscomputing@gmail.com> 1583453127 -0800
Initial commit
doc(meta): Add contributing guide and changelog
refactor(dataprep): Add project structure
feat(eda.plot): Implement plot(df) and plot(df,x,y) Close #5
refactor(dataprep): creating package structure
ci(CircleCI): Add circleci
build(dataprep): Add pytype & pylint & pipenv
refactor(eda.plot): plot function combined into one.
feat(eda.plot): Implement QQ norm plot And add mypy check
chore(dataprep): Add pull request template and chore commit type.
chore(dataprep): Add editor related directory to .gitignore E.g. ",idea" and ".vscode"
feat(eda.plot_correlation): Implement the calculation of intermediates
feat(eda.plot_correlation): Impelment visualization
docs(CONTRIBUTING): add more guidelines about PR
feat(eda.plot_missing, eda.plot_correlation): implement plot_missing and plot_correlation
feat(eda.plot): Add visualization code for plot(df, ...)
fix(eda.plot_missing): fix the plot issue of categorical data
style(eda.plot_missing): revise the import order
build: lock in dependencies version and make setup.py PIpenv aware
fix(eda.plot): fix the number of bars and bins of plot
fix(eda.plot_missing): fix the color, too many categories and actual name
style(eda.plot_missing): disable too-many-statements
fix(eda.plot_correlation): node size, color, x-label and order
fix(eda.plot_correlation): acceleration and polish figures
fix(eda.plot_correlation): polish figures
fix(eda.plot_missing): fix the color, too many categories and actual name
style(eda): format the code using black
chore(CONTRIBUTING): suggesting not having merge commits in PR
chore(README): add build status badge and contribution guideline
build: use black to do formatting
fix(eda.plot): Changed the histogram visualization, formatted x axis labels, added bars for missing values Made aesthetic changes to the plot(df) function: changed the histogram visualization, formated labels, added additional bars for missing values, specify the number of plots that appear on each row e.g. plot(df, ncolumns=4).
fix(eda.plot): Fixed comments from the pull request Fixed problems from the last pull request
fix(eda.plot): Made changes to test_plot.py so that all tests pass Fixed comments. All tests passed with the changes I made to test_plot.py which were necessary to deal with missing values.
fix(eda.plot): Forgot to add viz_uni.py to earlier commit Forgot to add viz_uni.py to my earlier commit
fix(eda.plot): Changed colours to match with new palette Changed colours to match with new colour palette
fix(eda.plot): Ran black so the code is formatted
build(CircleCI): Remove docker cache We reached the limit of the free plan.
fix(eda.plot): fixed problems with plot(df, x)
fix(eda.plot): fixed comments from pull request
fix(eda.plot): fixed additional comments from pull request
fix(eda.plot): removed unnecessary lines of code and changed sorting key
fix(eda.plot_correlation): Fixed comments from the Trello
fix(plot_correlation): fix corresponding test Fixed comments from the Trello
fix(eda.plot_correlation): Fixed comments from Trello
fix(eda.plot_correlation): Fixed comments from Trello
fix(eda.plot_correlation): Fixed comments from Trello
fix(eda.plot_correlation): Fixed comments from the Trello
fix(eda.plot_correlation): move sample_size to user function
docs: use sphinx to generate the doc
build: More compatibility for the dependencies
build: use anchor in the circleci yaml
fix(eda.plot_correlation): remove x label and self attention
feat(DataConnector): Merge DataConnector into Dataprep Link to DataConnectorConfigs repo
build: switch to poetry from pipenv
fix(eda.plot_correlation): add document
fix(eda.plot_missing): add document
fix(eda.plot): merge conflict with render.py
fix(eda.plot): fixed comments from pull request and bugs found in testing
build: More compatibility for the dependencies
refactor(eda.correlation): refactor computing code
refactor(eda.correlation): refactor rendering code
fix(eda): make rest of the code runnable with the new code
feat(data_connector): implement OAuth2 ClientCredentials implement show_schema test
fix(eda.plot_missing): add row and columns limitation
fix(eda.plot_missing): add document
fix(eda.plot_missing): change num_rows to bins_num
fix(eda.plot_missing): fix bugs of bins_num
fix(eda.plot_missing): fix details according to Jinglin comment
fix(eda.plot_missing): fix color, colorbar and x-label
fix(eda.plot_missing): consistent with viz_uni.py
refactor(eda.missing): refactor the code into compute_* and render_* Also removes holoview
feat(build): add poetry build to justfile
fix(eda.correlation): slightly tweak the heatmap visualization
fix(eda.correlation): Show top 30 if the cardinality is too large in the column
refactor(eda-basic): refactored some of the basic plot functions
refactor(eda-basic) finished refactoring the basic functions
refactor(eda-basic) removed show() from render
refactor(eda-basic): added the plot() function
fix(eda.basic): add sanity tests and minor fixes for CI
refactor(eda-basic): fixed comments from the pull request
feat(data_connector): Implement Github config file
fix(eda-basic): fixed the hover tooltip problem of when a column name contains a dash
feat(data_connector): Implement Github config file
chore: modify project info for first release
fix(eda): fix not recognizing categorical dtype
fix(eda-basic): fixed bugs when numerical column set as categorical
fix(eda-basic): added ngroups for boxplot and formatted intervals for histogram
fix(eda-correlation): optimize the correlation calculation of scatter
refactor(eda.correlation): refactor for readability
fix(eda-basic): fixed categorical column when values are non-strings, and updated the box plots
fix(eda-basic): made plots get larger if ngroups or nsubgroups is large
fix(eda.missing): decrease the min y_range to 0 for histograms The y_range min should be smaller than the minimal value in the column for histogram otherwise some bars are compressed down to the x-axis and are not visible. Also fixes not cutting off # of bars in plot_missing(df, x, y).
fix(eda.missing): Fix the label order and boxplot color
fix(eda.missing): accurately calculate missing spectrum using map_blocks
fix(eda.missing): correctly handle categorical data type
fix(data_connector): should auto re-download config files if there's an update in the config repo Update readme and examples Add images Fix readme image not working on pypi v0.1.0
fix(eda.missing): make the tooltip style align with plot(df)
fix(eda.correlation): it works for the columns with missing values
fix(eda.correlation): plot_correlation only supports for numerical data
fix(eda-basic): fixed xtics for histograms
fix(eda-basic): commented code
fix(eda-basic): add variables for Jinglin comment
fix(eda.correlation): fix scatter and top-k nan Committer: waterpine <songbian@zju.edu.cn>
docs(dataprep.eda): add documentation add documentation for eda, plot, plot_correlation and plot_missing
fix(docs): fix warnings
fix(eda-basic): fixed xtic rounding
fix(eda-basic): improved plot(df) efficiency
feat(data-connector): support template
fix(eda.missing): fix parameter names renew show schema simple .info demo refined show_schema and info methods
fix(dc.info): refined data_connector.info format
fix(dc.info): code revision resolve conflict with master refommat code fixed type issue further improve code further improve code blank further improve code
dovahcrow pushed a commit that referenced this issue on May 29, 2020
fatbuddy added a commit to fatbuddy/dataprep that referenced this issue on Feb 10, 2024
fatbuddy added a commit to fatbuddy/dataprep that referenced this issue on Mar 15, 2024
Goal: The plot function supports three call forms: plot(df), plot(df, x="x") and plot(df, x="x", y="y")
Step 1: create intermediates
Step 2: plot graphs based on intermediates
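
The two steps above could be sketched roughly as follows. This is only an illustration of the compute-then-render split, not the project's implementation: the names `Table`, `Intermediate`, `compute`, `render` and the `bins` parameter are hypothetical, a plain dict of columns stands in for a DataFrame, and text lines stand in for real figures.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

# Hypothetical stand-in for a DataFrame: column name -> list of values,
# with None marking a missing value.
Table = Dict[str, List[Any]]


@dataclass
class Intermediate:
    """Plot-ready summary handed from the compute step to the render step."""
    kind: str
    data: Dict[str, Any]


def _is_numeric(values: List[Any]) -> bool:
    present = [v for v in values if v is not None]
    return bool(present) and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )


def _histogram(values: List[float], bins: int) -> List[int]:
    """Count values in equal-width bins spanning [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid a zero width for constant columns
    counts = [0] * bins
    for v in values:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    return counts


def compute(df: Table, x: Optional[str] = None, y: Optional[str] = None,
            bins: int = 4) -> Intermediate:
    """Step 1: reduce the data to the intermediates a chart needs."""
    if x is None:
        if y is not None:
            raise ValueError("y requires x")
        # plot(df): one univariate intermediate per column
        return Intermediate("grid", {c: compute(df, c, bins=bins) for c in df})
    if y is None:
        # plot(df, x): histogram for numeric columns, bar counts otherwise
        values = [v for v in df[x] if v is not None]
        if _is_numeric(df[x]):
            return Intermediate("histogram", {"column": x, "counts": _histogram(values, bins)})
        return Intermediate("bar", {"column": x, "counts": dict(Counter(values))})
    # plot(df, x, y): keep only rows where both values are present
    points = [(a, b) for a, b in zip(df[x], df[y]) if a is not None and b is not None]
    return Intermediate("pairs", {"x": x, "y": y, "points": points})


def render(itmdt: Intermediate) -> List[str]:
    """Step 2: describe one chart per intermediate (text stands in for a figure)."""
    if itmdt.kind == "grid":
        return [line for child in itmdt.data.values() for line in render(child)]
    if itmdt.kind == "pairs":
        d = itmdt.data
        return [f"pairs {d['x']} vs {d['y']}: {len(d['points'])} points"]
    return [f"{itmdt.kind} of {itmdt.data['column']}: {itmdt.data['counts']}"]


def plot(df: Table, x: Optional[str] = None, y: Optional[str] = None) -> List[str]:
    return render(compute(df, x, y))
```

Keeping `compute` free of any drawing code means the intermediates can be tested or reused without a plotting backend, and `render` can be replaced by a real charting library without touching the data logic.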