Added excel normalization code and associated tests #3132

Merged
merged 7 commits into from Aug 25, 2023

Conversation

IamEzio
Contributor

@IamEzio IamEzio commented Aug 4, 2023

Fixes #2994

Related to #3027

This PR adds Excel normalization code: null rows and columns are removed when Excel files are read.
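
As a rough illustration of what "removing null rows and columns" means, here is a minimal, library-free sketch. It is not the code from this PR (which works on a DataFrame named `df`); the function name `drop_null_rows_and_columns` and the use of `None` for an empty cell are assumptions made for the example.

```python
def drop_null_rows_and_columns(rows):
    """Remove rows and columns whose cells are all None.

    `rows` is a list of equal-length lists; None stands in for an empty cell.
    (Hypothetical helper, not part of the PR.)
    """
    # Keep only rows that contain at least one non-null cell.
    kept_rows = [row for row in rows if any(cell is not None for cell in row)]
    if not kept_rows:
        return []
    # Keep only column positions that are non-null in at least one kept row.
    width = len(kept_rows[0])
    kept_cols = [
        i for i in range(width)
        if any(row[i] is not None for row in kept_rows)
    ]
    return [[row[i] for i in kept_cols] for row in kept_rows]


# A sheet with an empty leading column and empty first and last rows.
sheet = [
    [None, None, None],
    [None, "name", "age"],
    [None, "Ada", 36],
    [None, None, None],
]
print(drop_null_rows_and_columns(sheet))  # [['name', 'age'], ['Ada', 36]]
```

Rows are filtered before columns so that a column holding values only in otherwise-kept rows is still judged on the surviving data.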

Screenshots

excel-norm.mp4

Checklist

  • My pull request has a descriptive title (not a vague title like Update index.md).
  • My pull request targets the develop branch of the repository
  • My commit messages follow best practices.
  • My code follows the established code style of the repository.
  • I added tests for the changes I made (if applicable).
  • I added or updated documentation (if applicable).
  • I tried running the project locally and verified that there are no
    visible errors.

Developer Certificate of Origin

Developer Certificate of Origin
Version 1.1

Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
1 Letterman Drive
Suite D4700
San Francisco, CA, 94129

Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.


Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I
    have the right to submit it under the open source license
    indicated in the file; or

(b) The contribution is based upon previous work that, to the best
    of my knowledge, is covered under an appropriate open source
    license and I have the right under that license to submit that
    work with modifications, whether created in whole or in part
    by me, under the same open source license (unless I am
    permitted to submit under a different license), as indicated
    in the file; or

(c) The contribution was provided directly to me by some other
    person who certified (a), (b) or (c) and I have not modified
    it.

(d) I understand and agree that this project and the contribution
    are public and that a record of the contribution (including all
    personal information I submit with it, including my sign-off) is
    maintained indefinitely and may be redistributed consistent with
    this project or the open source license(s) involved.

@IamEzio IamEzio requested a review from dmos62 August 4, 2023 15:03
@IamEzio
Contributor Author

IamEzio commented Aug 4, 2023

@dmos62 PTAL. Thanks!

Member

@Anish9901 Anish9901 left a comment

Looks good overall, just a minor suggested change.

Comment on lines +40 to +42
if all(df.columns.str.startswith('Unnamed')):
    df.columns = df.iloc[0]
    df = df[1:]
Member

I think there should be a consideration here for whether or not the first row of the data is a header in #3030.

Contributor Author

Yeah, that is planned for the coming PRs.
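
To make the point of this thread concrete, here is a small sketch of how the header promotion in the reviewed lines could be made conditional. It is only an illustration under stated assumptions: it uses plain lists rather than the `df` from the diff, and the function name `promote_first_row_if_unnamed` and the `first_row_is_header` flag are hypothetical, not something this PR or #3030 defines. The `'Unnamed'` prefix is the one the reviewed lines check for.

```python
def promote_first_row_if_unnamed(columns, rows, first_row_is_header=True):
    """Use the first data row as the header when every column is unnamed.

    `columns` is a list of header names and `rows` a list of row lists.
    `first_row_is_header` is a hypothetical switch for the case raised in
    review: the caller may know the first row is data, not a header.
    """
    all_unnamed = bool(columns) and all(
        str(name).startswith("Unnamed") for name in columns
    )
    if first_row_is_header and rows and all_unnamed:
        # Promote the first row to header and drop it from the data.
        return list(rows[0]), rows[1:]
    # Otherwise leave both header and data untouched.
    return list(columns), rows


columns = ["Unnamed: 0", "Unnamed: 1"]
rows = [["name", "age"], ["Ada", 36]]

print(promote_first_row_if_unnamed(columns, rows))
# (['name', 'age'], [['Ada', 36]])
print(promote_first_row_if_unnamed(columns, rows, first_row_is_header=False))
# (['Unnamed: 0', 'Unnamed: 1'], [['name', 'age'], ['Ada', 36]])
```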

mathesar/tests/api/test_table_api.py (outdated; resolved)
@Anish9901 Anish9901 assigned IamEzio and unassigned dmos62 Aug 11, 2023
@Anish9901 Anish9901 added the "pr-status: revision" label (a PR awaiting follow-up work from its author after review) and removed the "pr-status: review" label (a PR awaiting review) Aug 11, 2023
@dmos62
Contributor

dmos62 commented Aug 16, 2023

Edited the top post so that this PR closes #2994. There might be follow-up PRs that touch up the feature, as outlined in #3027, but those file types will be moderately accepted by the time this PR is merged.

Co-authored-by: Anish Umale <umaleanish120@gmail.com>
@IamEzio IamEzio requested a review from Anish9901 August 18, 2023 10:33
Member

@Anish9901 Anish9901 left a comment

Okay, this looks good to me, thanks for your work on this @IamEzio!

@Anish9901 Anish9901 added this pull request to the merge queue Aug 18, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Aug 18, 2023
@Anish9901 Anish9901 added this pull request to the merge queue Aug 18, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Aug 18, 2023
@Anish9901 Anish9901 added this pull request to the merge queue Aug 18, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Aug 18, 2023
@Anish9901 Anish9901 added this pull request to the merge queue Aug 20, 2023
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Aug 20, 2023
@Anish9901 Anish9901 added this pull request to the merge queue Aug 25, 2023
Merged via the queue into mathesar-foundation:develop with commit ca66cce Aug 25, 2023
9 checks passed
@IamEzio IamEzio deleted the excel-normalization branch August 28, 2023 03:07
Labels
pr-status: revision A PR awaiting follow-up work from its author after review
Development

Successfully merging this pull request may close these issues.

Support xls and xlsx
4 participants