[json] Unable to format large JSON file #79014

Closed
old-profile opened this issue Aug 13, 2019 · 12 comments
Labels: editor-wrapping (Editor line wrapping issues), freeze-slow-crash-leak (VS Code crashing, performance, freeze and memory leak issues)

@old-profile

Issue Type: Bug

I have a JSON file that is 5 MB in size. When I try to format it, nothing happens.

VS Code version: Code 1.36.1 (2213894, 2019-07-08T22:59:35.033Z)
OS version: Windows_NT x64 10.0.17134

@yoshiask

VS Code has issues tokenizing large files. The algorithm is not perfect (nor do I know better alternatives). Try using a computer with a better CPU and/or more RAM, as it may help a bit. If you don't have access to a more powerful PC, maybe all you have to do is wait for a while as it does its work.

@old-profile
Author

image
I don't think CPU or memory is the issue. I have used Notepad++ and Atom in the past, and both could easily format JSON files of similar size. It seems to me that the algorithm could use some optimization.

@pramit-marattha

image

@alexdima
Member

I could not reproduce this with a JSON file generated using the following script:

const fs = require('fs');

let obj = {};
for (let i = 0; i < 103210; i++) {
	obj[`a${i}`] = 'asdasdadasdasdadasdasdasdasd';
}

fs.writeFileSync('out.json', JSON.stringify(obj));

After running "Format Document" it takes some time, but the JSON gets formatted:

format

@alexdima alexdima added the info-needed Issue requires more information from poster label Aug 13, 2019
@gajduk

gajduk commented Aug 13, 2019

I am glad you are looking into this. As your example already shows, formatting takes quite some time, and there is no indication that anything is happening. That aside, in my case I waited for several minutes and still nothing happened.

I cannot post the original JSON because it contains a lot of private data, but I can share some patterns that I see:

  1. it is nested many levels deep (~20), unlike yours, which is all on the same level
  2. some values are very long, as they represent base64-encoded PNG files
  3. some values are large escaped JSON strings, so they contain many escaped special JSON characters such as brackets, quotes, commas, etc.

Here is a modified script to generate such a file:
const fs = require('fs');

function randomStringShort() {
    return Math.random().toString(36).substring(7);
}

function randomStringLong() {
    let res = '';
    for (let i = 0; i < 50 ; i++) {
        res += Math.random().toString(36);
    }
    return res;
}

function randomString() {
    if ( Math.random() < 0.05 ) return randomStringLong();
    else                       return randomStringShort();
}


// Recursively build a randomly nested object; leaves are random strings.
function generateNestedRec(level) {
    if ( level > 20 || Math.random() < 0.35 ) return randomString();
    if ( level > 5  && Math.random() < 0.6 ) return randomString(); // do not generate too much data
    let res = {};
    for (let i = 0; i < 5; i++) {
        if ( level === 1 && i > 2 ) {
            res[randomStringShort()] = JSON.stringify(res); // long escaped json string
        }
        else {
            res[randomStringShort()] = generateNestedRec(level+1);
        }
    }
    return res;
}

let obj = generateNestedRec(0);
fs.writeFileSync('out.json', JSON.stringify(obj));

Attached is a generated sample file that not only fails to format but actually crashes VS Code:
out.txt

@alexdima
Member

Formatting for me freezes VS Code for a while, but does not lead to a crash:

image

Things that can be improved:

  • the view model is constructed twice, once as a result of the model content changes, and once as a result of the line numbers width increasing
  • bracket matching is not time bound
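
To illustrate what "time bound" could mean for bracket matching, here is a minimal sketch that scans forward for a matching bracket and gives up once a time budget is exhausted. The function name `findMatchingBracket`, the `budgetMs` parameter, and the returned shape are all hypothetical; this is not VS Code's implementation, and for brevity it does not skip brackets that appear inside string literals.

```javascript
// Hypothetical sketch, not VS Code's code: find the bracket that closes the
// one at `start`, but stop once `budgetMs` milliseconds have elapsed.
function findMatchingBracket(text, start, budgetMs = 20) {
    const pairs = { '{': '}', '[': ']', '(': ')' };
    const open = text[start];
    const close = pairs[open];
    if (!close) return { index: -1, timedOut: false };

    const deadline = Date.now() + budgetMs;
    let depth = 0;
    for (let i = start; i < text.length; i++) {
        // Consult the clock only every 4096 characters to keep the loop cheap.
        if ((i & 4095) === 0 && Date.now() > deadline) {
            return { index: -1, timedOut: true };
        }
        const ch = text[i];
        if (ch === open) depth++;
        else if (ch === close && --depth === 0) return { index: i, timedOut: false };
    }
    return { index: -1, timedOut: false }; // unbalanced input
}
```

A caller that sees `timedOut: true` could simply skip bracket highlighting for that cursor position instead of blocking the UI.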

@alexdima alexdima added freeze-slow-crash-leak VS Code crashing, performance, freeze and memory leak issues bug Issue identified by VS Code Team member as probable bug editor-wrapping Editor line wrapping issues and removed info-needed Issue requires more information from poster labels Aug 13, 2019
@rjk

rjk commented Dec 12, 2019

I consistently get a crash on a 13 MB JSON file. I'm happy to share it directly with anyone working on this, but can't post it publicly.

FYI: when I format it using PowerShell, it grows to 120 MB. Anyone wanting to do the same in PowerShell can use something like this:

Get-Content -Raw -Path yourfile.json | ConvertFrom-Json | ConvertTo-Json -depth 100 | Set-Content yourfile-pretty.json
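
For those who prefer Node.js, as used in the generator scripts earlier in this thread, a rough equivalent is sketched below. The helper name `prettyPrintJsonFile` and the file names are placeholders. It reads the whole file into memory, so it is only an assumption that it copes with a given file size.

```javascript
const fs = require('fs');

// Parse the whole file and re-serialize it with the given indentation.
// Hypothetical helper; loads the entire document into memory.
function prettyPrintJsonFile(inputPath, outputPath, indent = 2) {
    const parsed = JSON.parse(fs.readFileSync(inputPath, 'utf8'));
    fs.writeFileSync(outputPath, JSON.stringify(parsed, null, indent));
}
```

Usage would look like `prettyPrintJsonFile('yourfile.json', 'yourfile-pretty.json')`.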

@old-profile
Author

Since Visual Studio Code is way too slow and unstable, I set out to search for a formatting tool and found this gem: https://www.jsonformatter.io/
It is blazing fast and has never crashed on me. I have fed it 50 MB files and it formatted them almost instantly. Maybe the people from Visual Studio Code can acquire the code behind it.

@alexdima
Member

This file from #79014 (comment) is working a lot better for me after the changes from #88405

When I run Format, it takes the language service a while to compute the edits, and once the edits are in, it takes the editor around 1s to apply them (264k edits!!), which I think is reasonable:
image

@rjk Can you please share with us a JSON file that reproduces the crash?

@alexdima alexdima added info-needed Issue requires more information from poster and removed bug Issue identified by VS Code Team member as probable bug labels Jan 10, 2020
@alexdima alexdima added this to the January 2020 milestone Jan 10, 2020
@rjk

rjk commented Jan 10, 2020

I just retested now.

Using 1.40.2, it crashed VS Code. Steps:

  • opened VS Code
  • opened the JSON file
  • right-click on document > Format Document
  • wait about 30s
  • a dialog displayed saying VS Code had to close, and I had the option to have it restart. (Sorry, took no screenshot and didn't note exact wording).

Using VS Code 1.41.1 with same steps I instead get a message 'Extension host terminated unexpectedly' and there's a 'Javascript heap out of memory' error in devtools. VS Code memory usage got to about 3.8GB. VS Code remains open and usable.
The first time I restarted extension host and tried again to Format Document, VS Code hung. Steps:

  • Click Restart Extension Host
  • wait till right-click menu included Format Document
  • clicked Format Document
  • VS Code hung. CPU was low ~30% and memory ~1.2GB, but UI unresponsive
  • I get 'The window is no longer responding' messages (Reopen/Keep Waiting/Close).
  • After about the 3rd/4th one (with a couple of minutes waiting in between each display of the dialog and me clicking Keep Waiting) I chose Close. VS Code closed.

After launching VS Code again I was able to replicate the Extension Host crashing but not the VS Code hang. I haven't tried more than once.

I've emailed the file to you now @alexdima

@alexdima
Member

@aeschli The JSON formatter produces 1_400_000 edits for this particular file. In any case, when more than 1_000 edits are applied, the editor merges them into a single edit. But a lot of time and memory is wasted transmitting those 1_400_000 edits, validating their ranges, etc.

Would it be possible to change the JSON formatter to return a single huge change in this case?
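
As a rough illustration of the idea (not the actual fix, and not the formatter's real API), many small non-overlapping edits can be collapsed into one edit that spans from the first to the last of them. The edit shape `{ start, end, text }` with character offsets into the original text, and the helpers `mergeEdits` and `applyEdit`, are assumptions made for this sketch.

```javascript
// Hypothetical edit shape: { start, end, text }, offsets into the original text.
// Collapses non-overlapping edits into one edit covering their combined span.
function mergeEdits(original, edits) {
    if (edits.length === 0) return null;
    const sorted = [...edits].sort((a, b) => a.start - b.start);
    const first = sorted[0].start;
    let cursor = first;
    const parts = [];
    for (const e of sorted) {
        if (e.start < cursor) throw new Error('overlapping edits');
        // Keep the untouched text between edits, then add the replacement.
        parts.push(original.slice(cursor, e.start), e.text);
        cursor = e.end;
    }
    return { start: first, end: cursor, text: parts.join('') };
}

// Apply a single edit to the original text.
function applyEdit(original, edit) {
    return original.slice(0, edit.start) + edit.text + original.slice(edit.end);
}
```

With this shape, only one range has to be transmitted and validated, at the cost of resending the unchanged text that lies between the individual edits.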

@alexdima alexdima assigned aeschli and unassigned alexdima Jan 20, 2020
@aeschli aeschli changed the title Unable to format large JSON file [json] Unable to format large JSON file Jan 22, 2020
@aeschli aeschli removed the info-needed Issue requires more information from poster label Jan 22, 2020
@aeschli
Contributor

aeschli commented Jan 22, 2020

Fixed by ce31ace

@aeschli aeschli closed this as completed Jan 22, 2020
@vscodebot vscodebot bot locked and limited conversation to collaborators Mar 7, 2020