CSV format is broken #332

Closed
alexey-milovidov opened this issue Apr 9, 2022 · 8 comments
Labels
bug Something isn't working

Comments

@alexey-milovidov
Contributor

Describe the bug
If a filename contains a , character, it is not properly quoted in the CSV output.

To Reproduce
Clone this repo and run scc: https://github.com/freeCodeCamp/freeCodeCamp
The offending file name is curriculum/challenges/italian/08-data-analysis-with-python/numpy/accessing-and-changing-elements,-rows,-columns.md.

Expected behavior
Put the filename in quotes ("filename") when needed. Any quote character " inside the filename should also be escaped as "".
Example: if the filename is Hello, "world", it should be output as "Hello, ""world""".
Newlines, tabs, and any other special characters in filenames should be handled as well.
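For reference, the quoting described above is what Go's standard encoding/csv writer produces. The following is a minimal sketch, not code from scc; the helper name encodeRecord and the sample record are assumptions made for illustration.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
)

// encodeRecord is a hypothetical helper (not part of scc). It renders one
// CSV record with the standard library writer, which quotes fields that
// contain a comma, a quote, or a line break, and doubles embedded quotes.
func encodeRecord(fields []string) (string, error) {
	var buf bytes.Buffer
	w := csv.NewWriter(&buf)
	if err := w.Write(fields); err != nil {
		return "", err
	}
	w.Flush()
	if err := w.Error(); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	// The middle field is the filename from the example: Hello, "world"
	line, err := encodeRecord([]string{"Markdown", `Hello, "world"`, "12"})
	if err != nil {
		panic(err)
	}
	fmt.Print(line) // prints: Markdown,"Hello, ""world""",12
}
```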

Desktop (please complete the following information):

  • OS: Linux
  • Version: 3.0.0
@alexey-milovidov
Contributor Author

Another issue: it outputs one extra empty line at the end.

@boyter boyter added the bug Something isn't working label Apr 10, 2022
@boyter
Owner

boyter commented Apr 10, 2022

That's odd... I was using the standard library CSV package in Go, which I would have thought would handle this.

@KAAtheWiseGit
Contributor

I just checked this on the freeCodeCamp repository and on a test repository of my own, and the quoting seems to work properly:

Language,Location,Filename,Lines,Code,Comments,Blanks,Complexity,Bytes
Python,"dir,with,quotes/another,comma,.py","another,comma,.py",1,1,0,0,0,17
Python,",name,with,commas.py",",name,with,commas.py",1,1,0,0,0,17

@boyter
Owner

boyter commented Apr 27, 2023

Neat, perhaps that's the trick then.

When I get some time I'll have another look at this. Unless someone wants to solve it before then :)

@alexey-milovidov
Contributor Author

Not fixed.

For example, see https://github.com/anthraxx/linux-hardened/tree/master/drivers/staging/mt7621-pci
It has a file named mediatek,mt7621-pci.txt.

And the scc output is:

Plain Text,anthraxx/linux-hardened/drivers/staging/mt7621-pci/mediatek,mt7621-pci.txt,mediatek,mt7621-pci.txt,104,87,0,17,0,3327

The invocation is as follows: ~/go/bin/scc --format csv-stream
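One way to see the problem is to parse the row back and count its fields. The sketch below assumes csv-stream uses the same nine-column header as the csv format shown earlier in this thread; countFields is a hypothetical helper, not part of scc.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// countFields is a hypothetical helper: it parses a single CSV line and
// reports how many fields a conforming reader sees in it.
func countFields(line string) (int, error) {
	r := csv.NewReader(strings.NewReader(line))
	r.FieldsPerRecord = -1 // do not enforce a fixed column count
	rec, err := r.Read()
	if err != nil {
		return 0, err
	}
	return len(rec), nil
}

func main() {
	// Header assumed to match the one printed by the csv format.
	header := "Language,Location,Filename,Lines,Code,Comments,Blanks,Complexity,Bytes"
	row := "Plain Text,anthraxx/linux-hardened/drivers/staging/mt7621-pci/mediatek,mt7621-pci.txt,mediatek,mt7621-pci.txt,104,87,0,17,0,3327"

	h, err := countFields(header)
	if err != nil {
		panic(err)
	}
	n, err := countFields(row)
	if err != nil {
		panic(err)
	}
	fmt.Println(h, n) // prints: 9 11
}
```

The unquoted commas in Location and Filename each split one field into two, so the row has two more fields than the header.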

@KAAtheWiseGit
Contributor

I looked into the code, and csv-stream, unlike the csv format, does
not allocate any memory, so it does not use the standard CSV library.
It had a quoting bug, which was fixed by e77be1a, but the fix hasn't
been released yet.
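A writer that bypasses encoding/csv has to apply the quoting rule itself. The following is a rough sketch of that rule only; it is not the code from e77be1a, the name quoteCSVField is hypothetical, and unlike csv-stream this version does allocate.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteCSVField is a hypothetical helper, not scc's implementation.
// It wraps a field in double quotes when the field contains a comma,
// a double quote, or a line break, and doubles any embedded quotes.
func quoteCSVField(s string) string {
	if !strings.ContainsAny(s, ",\"\r\n") {
		return s // nothing special, emit as-is
	}
	return `"` + strings.ReplaceAll(s, `"`, `""`) + `"`
}

func main() {
	fields := []string{"Plain Text", "mediatek,mt7621-pci.txt", "104"}
	for i, f := range fields {
		fields[i] = quoteCSVField(f)
	}
	fmt.Println(strings.Join(fields, ",")) // prints: Plain Text,"mediatek,mt7621-pci.txt",104
}
```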

@boyter
Owner

boyter commented May 7, 2023

Yep. I am still working on getting the file walking to respect all ignore files properly first. Once that's done I can come back and clean up a lot of these issues.

@boyter
Owner

boyter commented May 10, 2024

The fix is included in the latest release, 3.3.3, so I am closing this.

@boyter boyter closed this as completed May 10, 2024
3 participants