Improve TSV serialization when copying cells #2867

Merged
4 commits merged on Jun 5, 2023

seancolsen
Contributor

Fixes #2811

Notes

This PR adds the PapaParse library and uses it to serialize cell data to a TSV string when copying cells to the clipboard.

Before/after

  • Set up four cells (two rows in columns a and b) with the following content. Then copy them and inspect your clipboard.

    Row 1, column a:
        apple
        banana, orange
    Row 1, column b:
        cherry⏩tomato
    Row 2, column a:
        "asparagus
        potato"
    Row 2, column b:
        "broccoli"
        carrot

    (with the ⏩ emoji representing a tab character)

  • Before

    The clipboard contents are:

    apple
    banana, orange⏩cherry⏩tomato
    "asparagus
    potato"⏩"broccoli"
    carrot
    

    Pasting into LibreOffice Calc or Google Sheets leads to mangled data.

  • After

    The clipboard contents are:

    "apple
    banana, orange"⏩"cherry⏩tomato"
    """asparagus
    potato"""⏩"""broccoli""
    carrot"
    

    Pasting into LibreOffice Calc and Google Sheets works as expected.
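
The before/after difference comes down to one quoting rule: a field that contains a tab, a newline, or a double quote is wrapped in double quotes, and any embedded double quotes are doubled. Here is a minimal sketch of that rule, written by hand for illustration only. It is not PapaParse's implementation or the code in this PR; the function names and the choice of "\n" as the row separator are assumptions.

```typescript
// Hypothetical helpers illustrating the quoting rule; not the PR's code.
const TAB = "\t";

/** "Before" behavior: join cells with tabs and rows with newlines, no quoting. */
function serializeNaive(rows: string[][]): string {
  return rows.map((row) => row.join(TAB)).join("\n");
}

/** Quote a field only when it contains a tab, a line break, or a double quote. */
function escapeTsvField(value: string): string {
  const needsQuoting = /["\t\r\n]/.test(value);
  if (!needsQuoting) return value;
  // Double any embedded quotes, then wrap the whole field in quotes.
  return `"${value.replace(/"/g, '""')}"`;
}

/** "After" behavior: the same joins, but each field is escaped first. */
function serializeQuoted(rows: string[][]): string {
  return rows.map((row) => row.map(escapeTsvField).join(TAB)).join("\n");
}

// The four cells from the example above (two rows, columns a and b).
const cells: string[][] = [
  ["apple\nbanana, orange", "cherry\ttomato"],
  ['"asparagus\npotato"', '"broccoli"\ncarrot'],
];

const naive = serializeNaive(cells);
const quoted = serializeQuoted(cells);

console.log(naive);
console.log(quoted);
```

In the naive output, cell boundaries cannot be told apart from the tabs and newlines inside the cells, which is why spreadsheets mangle it. In the quoted output every cell boundary stays unambiguous.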

Bundle size increase

                 Before        After         Change
main.js          1355.70 KiB   1375.42 KiB   + 19.72 KiB
main.js + gzip   374.19 KiB    381.86 KiB    + 7.67 KiB

Choice of library

Key

  • ⭐ = GitHub stars
  • 💾 = NPM downloads per week
  • 🗓️ = time since most recent NPM update
  • 📎 = NPM dependencies
  • 🧩 = NPM dependents
  • 📦 = NPM reported package size

The one I chose

  • papaparse
    • npm | GH
    • 11.3k ⭐ | 1.8M 💾 | 2 mo 🗓️ | 0 📎 | 1376 🧩 | 260 kB 📦

Alternatives worth considering

  • csv

    • npm | GH
    • 3.5k ⭐ | 3M 💾 | 0 mo 🗓️ | 0 📎 | 1879 🧩 | 2000 kB 📦
    • Concerned about bundle size, though it's hard to tell how much our bundle size would actually increase after tree-shaking. This project is split into separate packages (csv-parse, csv-stringify, etc.).
  • json-2-csv

    • npm | GH
    • 0.3k ⭐ | 0.1M 💾 | 2 mo 🗓️ | 2 📎 | 170 🧩 | 100 kB 📦
    • Looks to be newer

Tried and didn't like

  • csv-string
    • npm | GH
    • 0.1k ⭐ | 0.1M 💾 | 7 mo 🗓️ | 0 📎 | 127 🧩 | 30 kB 📦
    • More recent, API looks nice
    • I tried this and ran into this open issue because a polyfill is required.

Others I looked at

  • csv-parser

    • npm | GH
    • 1.3k ⭐ | 1.3M 💾 | 2 yr 🗓️ | 1 📎 | 707 🧩 | 30 kB 📦
    • Doesn't handle serialization
  • comma-separated-values

    • npm | GH
    • 1.5k ⭐ | 0.04M 💾 | 8 yr 🗓️ | ? 📎 | ? 🧩 | ? 📦
    • Concerned about lack of maintenance
  • d3-dsv | GH

    • 0.4k ⭐ | 2.1M 💾 | ? 🗓️ | 3 📎 | 316 🧩 | 51 kB 📦
    • Seems to be mostly geared towards integration with D3
  • csvtojson

    • npm | GH
    • 1.9k ⭐ | 0.7M 💾 | 4 yr 🗓️ | ? 📎 | ? 🧩 | ? 📦
    • Might be overkill; doesn't handle serialization
  • neat-csv

    • npm
    • Just a wrapper around csv-parser
  • tsv

    • npm
    • Unmaintained
  • zsv-lib

    • GitHub
    • wasm-based
    • still in alpha

Checklist

  • My pull request has a descriptive title (not a vague title like Update index.md).
  • My pull request targets the develop branch of the repository.
  • My commit messages follow best practices.
  • My code follows the established code style of the repository.
  • I added tests for the changes I made (if applicable).
  • I added or updated documentation (if applicable).
  • I tried running the project locally and verified that there are no visible errors.

Developer Certificate of Origin

Version 1.1

Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
1 Letterman Drive
Suite D4700
San Francisco, CA, 94129

Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.


Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I
    have the right to submit it under the open source license
    indicated in the file; or

(b) The contribution is based upon previous work that, to the best
    of my knowledge, is covered under an appropriate open source
    license and I have the right under that license to submit that
    work with modifications, whether created in whole or in part
    by me, under the same open source license (unless I am
    permitted to submit under a different license), as indicated
    in the file; or

(c) The contribution was provided directly to me by some other
    person who certified (a), (b) or (c) and I have not modified
    it.

(d) I understand and agree that this project and the contribution
    are public and that a record of the contribution (including all
    personal information I submit with it, including my sign-off) is
    maintained indefinitely and may be redistributed consistent with
    this project or the open source license(s) involved.

@seancolsen added this to the Next release milestone May 9, 2023
@seancolsen added the "pr-status: review" (A PR awaiting review) label May 9, 2023
Member

@pavish left a comment

Looks great!

@pavish added this pull request to the merge queue Jun 5, 2023
Merged via the queue into develop with commit 289017f Jun 5, 2023
22 checks passed
@pavish deleted the 2811_clipboard_tsv branch June 5, 2023 20:19
Labels
pr-status: review A PR awaiting review
Development

Successfully merging this pull request may close these issues.

Improve TSV serialization for tab and newline characters in cell value when copying cells to the clipboard
2 participants