This repository has been archived by the owner on Mar 3, 2023. It is now read-only.

Large file support #307

Closed
watsonian opened this issue Feb 25, 2013 · 140 comments

@watsonian
Contributor

So, I use my text editor for all kinds of things (writing code, reading readmes, editing dotfiles, etc). One of the things I do most often during any given day is open up large log files to troubleshoot Enterprise problems. These files can sometimes get as large as 500-800MB.

I just tried viewing a 350MB log file and Atom locked up immediately -- the file selector didn't change and the entire window went completely white 4 seconds after I tried viewing the file. I could still close the top level window, but we should still have better handling for this kind of thing.

Sublime Text 2 took about 55 seconds to open the file, but gave a nice progress indicator while it was loading:

[image: Screen Shot 2013-02-24 at 4 50 56 PM] https://f.cloud.github.com/assets/244/190379/c0abc27e-7ee5-11e2-8ec4-4205cdee0ad0.png

Once it loaded it was as responsive as any other file I load (scrolled fast, normal text highlight performance, etc). It would be nice if we set this as a baseline for expected behavior (progress bar + responsive after loading).

@nathansobo
Contributor

Yeah, we currently do a blocking read to load files. We need to switch to something event-driven. I also wonder if we can load the file progressively off the disk as needed so we don't have to read the whole thing into memory. But that's super-advanced. Just avoiding blocking the UI thread would be a good start. Yet another thing I'd like to do with Node.
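
A minimal sketch of the event-driven read mentioned above, assuming plain Node.js. `loadFileWithProgress` and its `onProgress` callback are hypothetical names used for illustration and are not Atom APIs. The file is streamed in chunks so the event loop stays free between reads, and the callback receives the fraction loaded so far, which could drive a progress indicator.

```javascript
const fs = require("fs");

// Hypothetical helper (not an Atom API): read a file in 64 KB chunks without
// blocking the event loop, reporting progress as a fraction between 0 and 1.
async function loadFileWithProgress(filePath, onProgress) {
  const { size } = await fs.promises.stat(filePath);
  const chunks = [];
  let bytesRead = 0;

  const stream = fs.createReadStream(filePath, { highWaterMark: 64 * 1024 });
  for await (const chunk of stream) {
    chunks.push(chunk);
    bytesRead += chunk.length;
    onProgress(size === 0 ? 1 : bytesRead / size);
  }

  // Decode once at the end so multi-byte characters that straddle a chunk
  // boundary are not corrupted.
  return Buffer.concat(chunks).toString("utf8");
}
```

This only addresses the blocked UI thread. The whole file still ends up in memory, so it does not cover the progressive, on-demand loading idea.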


@watsonian
Contributor Author

I just tried opening a 124MB JSON file directly from the CLI and got this:

RangeError: Maximum call stack size exceeded
    at BufferChangeOperation.module.exports.BufferChangeOperation.changeBuffer (/Applications/Atom.app/Contents/Resources/src/app/buffer-change-operation.js:99:17)
    at BufferChangeOperation.module.exports.BufferChangeOperation.do (/Applications/Atom.app/Contents/Resources/src/app/buffer-change-operation.js:37:19)
    at Buffer.module.exports.Buffer.pushOperation (/Applications/Atom.app/Contents/Resources/src/app/buffer.js:348:31)
    at Buffer.module.exports.Buffer.change (/Applications/Atom.app/Contents/Resources/src/app/buffer.js:314:20)
    at Buffer.module.exports.Buffer.setText (/Applications/Atom.app/Contents/Resources/src/app/buffer.js:174:19)
    at Buffer.module.exports.Buffer.reload (/Applications/Atom.app/Contents/Resources/src/app/buffer.js:126:12)
    at new Buffer (/Applications/Atom.app/Contents/Resources/src/app/buffer.js:65:14)
    at Project.module.exports.Project.buildBuffer (/Applications/Atom.app/Contents/Resources/src/app/project.js:270:16)
    at Project.module.exports.Project.bufferForPath (/Applications/Atom.app/Contents/Resources/src/app/project.js:259:33)
    at Project.module.exports.Project.buildEditSessionForPath (/Applications/Atom.app/Contents/Resources/src/app/project.js:180:41) index.html:23
window.onload

@probablycorey

This moved forward after #939. But there is still some telepath work to get the rest working. Adding single-user mode to telepath that eliminates the need for each character to have a location may be a solution.

@stuart-warren

I'm failing to open up a 1.4MB HTML file with the same RangeError: Maximum call stack exceeded

Running Version 0.105
Has anything happened on this issue since October?

Full error:

Window load time: 761ms 
/Applications/Atom.app/Contents/Resources/app/src/window-bootstrap.js:18
RangeError: Maximum call stack size exceeded
  at DisplayBuffer.module.exports.DisplayBuffer.updateScreenLines (/Applications/Atom.app/Contents/Resources/app/src/display-buffer.js:1246:17)
  at DisplayBuffer.module.exports.DisplayBuffer.updateAllScreenLines (/Applications/Atom.app/Contents/Resources/app/src/display-buffer.js:179:19)
  at new DisplayBuffer (/Applications/Atom.app/Contents/Resources/app/src/display-buffer.js:94:12)
  at new Editor (/Applications/Atom.app/Contents/Resources/app/src/editor.js:76:30)
  at Project.module.exports.Project.buildEditorForBuffer (/Applications/Atom.app/Contents/Resources/app/src/project.js:430:16)
  at /Applications/Atom.app/Contents/Resources/app/src/project.js:192:24
  at _fulfilled (/Applications/Atom.app/Contents/Resources/app/node_modules/q/q.js:787:54)
  at self.promiseDispatch.done (/Applications/Atom.app/Contents/Resources/app/node_modules/q/q.js:816:30)
  at Promise.promise.promiseDispatch (/Applications/Atom.app/Contents/Resources/app/node_modules/q/q.js:749:13)
  at /Applications/Atom.app/Contents/Resources/app/node_modules/q/q.js:557:44
  at flush (/Applications/Atom.app/Contents/Resources/app/node_modules/q/q.js:108:17)
  at process._tickCallback (node.js:605:11)

Thanks

@probablycorey

Opening files larger than 2MB is an edge case that most users don't run into, but I wanted to investigate this.

How to create a large file

Download this file and run it: generate-text five-megabytes-of-words.txt 5

Good news

Atom handles 5MB without changing anything.

Bad news

Atom hangs when opening a file >10MB.

Atom eats up ~300MB of RAM when opening large files.

[image: Activity Monitor, all processes]

@probablycorey

I modified window-bootstrap to test out some memory assumptions. Here is the memory footprint of Atom when opening a 40MB file:

Baseline (window-bootstrap.coffee does nothing): 50MB
Baseline + file via fs: 96MB
Baseline + file via File: 181MB
Baseline + TextBuffer: 538MB

@nathansobo
Contributor

I'm thinking we probably need to store the TextBuffer data off the v8 heap using a C++ extension, otherwise it hoses GC performance.

@nathansobo
Contributor

But there's also a ton of other data such as tokenization etc.

@zcbenz
Contributor

zcbenz commented Jul 11, 2014

Node's Buffer stores its data on the C++ heap. But if we want to support really big files, we have to avoid reading the whole file into memory.
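
A rough way to observe this from Node.js, as a sketch only: `measureBufferAllocation` is a hypothetical helper, and the exact numbers depend on the Node version and on garbage-collection timing. It compares the `heapUsed` and `external` fields of `process.memoryUsage()` before and after allocating a Buffer. The Buffer's bytes are expected to show up under `external` rather than in the V8 heap.

```javascript
// Hypothetical helper: allocate `size` bytes as a Buffer and report how much
// the V8 heap and the external (off-heap) memory counters changed.
function measureBufferAllocation(size) {
  const before = process.memoryUsage();
  const data = Buffer.alloc(size, 0x61); // filled with the byte for "a"
  const after = process.memoryUsage();
  return {
    data, // returned so the Buffer stays reachable while it is measured
    heapUsedDelta: after.heapUsed - before.heapUsed,
    externalDelta: after.external - before.external,
  };
}

const result = measureBufferAllocation(32 * 1024 * 1024);
console.log("heapUsed delta (bytes):", result.heapUsedDelta);
console.log("external delta (bytes):", result.externalDelta);
```

Converting the Buffer to a JavaScript string with `toString` would create a separate copy managed by V8, which is one reason a string-based text buffer can cost more than the raw file size.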

@nathansobo
Contributor

Interested in any ideas you have about the design of that, @zcbenz. The hardest part might be that a bunch of APIs would need to go async to support loading more of the file for certain operations, which could make for more awkward scripting.

@zcbenz
Contributor

zcbenz commented Jul 11, 2014

The basic idea is to draw only the parts of the file that are needed. Because drawing then depends on disk I/O, it needs a good caching strategy to keep things smooth; many native apps use memory-mapped files to simplify this. A rough design:

When opening a large file:

  1. Quickly scan the whole file to count how many lines there are and how long each line is.
  2. The editor view computes and draws a virtual scroll bar.
  3. The editor view decides which part of the file should be read into memory according to the current scroll position.
  4. The editor view reads that part of the file and draws it.

When the user scrolls the editor view:
repeat steps 3-4

When the editor view is resized:
repeat steps 2-4

When the file is changed externally:
repeat steps 1-4

  • Steps 1 and 4 could be asynchronous.
  • Step 1 would need to take responsibility for tokenization.
  • When the file is so large that even scanning it takes too long, we may give up scanning and tokenizing it. The editor view can then show a scroll bar with a minimal drag handle, and when the user drags it the editor view roughly computes a start position from which to read the file.
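
Steps 1, 3 and 4 above can be sketched in Node.js roughly as follows. This is an illustration under stated assumptions, not Atom's implementation: `buildLineIndex` and `readLines` are hypothetical names, the reads are synchronous for brevity (the design suggests steps 1 and 4 could be asynchronous), lines are assumed to end in `\n`, and no caching or tokenization is attempted.

```javascript
const fs = require("fs");

// Step 1 (hypothetical helper): scan the file once in fixed-size chunks and
// record the byte offset at which each line starts. Only offsets are kept.
function buildLineIndex(filePath, chunkSize = 64 * 1024) {
  const fd = fs.openSync(filePath, "r");
  const lineStarts = [0];
  const chunk = Buffer.alloc(chunkSize);
  let position = 0;
  try {
    let bytesRead;
    while ((bytesRead = fs.readSync(fd, chunk, 0, chunkSize, position)) > 0) {
      for (let i = 0; i < bytesRead; i++) {
        if (chunk[i] === 0x0a) lineStarts.push(position + i + 1);
      }
      position += bytesRead;
    }
  } finally {
    fs.closeSync(fd);
  }
  return { lineStarts, size: position };
}

// Steps 3-4 (hypothetical helper): read only lines [firstLine, lastLine),
// e.g. the rows currently visible in the editor view.
function readLines(filePath, index, firstLine, lastLine) {
  const { lineStarts, size } = index;
  const first = Math.max(0, Math.min(firstLine, lineStarts.length));
  const last = Math.max(first, Math.min(lastLine, lineStarts.length));
  if (first === last) return [];

  const start = lineStarts[first];
  const end = last < lineStarts.length ? lineStarts[last] : size;
  const buffer = Buffer.alloc(end - start);
  const fd = fs.openSync(filePath, "r");
  try {
    let offset = 0;
    while (offset < buffer.length) {
      const n = fs.readSync(fd, buffer, offset, buffer.length - offset, start + offset);
      if (n === 0) break; // file shrank since it was indexed
      offset += n;
    }
  } finally {
    fs.closeSync(fd);
  }
  // Splitting can leave one extra empty element after a trailing newline,
  // so keep exactly the number of lines that were requested.
  return buffer.toString("utf8").split("\n").slice(0, last - first);
}
```

With an index in hand, `index.lineStarts.length` gives the line count needed to size the virtual scroll bar in step 2, and scrolling only calls `readLines` again with a new range. Files with `\r\n` line endings would keep a trailing `\r` on each line in this sketch.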

@i4004

i4004 commented Jul 25, 2014

I tried to open a ~8GB text file, but nothing happened at all :)

@batjko
Contributor

batjko commented Jul 25, 2014

I'm not sure how they do it, but Baretail is a pretty impressive log file viewer in that regard.
It doesn't care how big your file is and seems to load only portions of it as needed, yet seek and search across the entire monster file are very fast. It also does a live refresh of the bottom lines (useful for log files, of course), all while keeping the memory footprint quite small.

I assume that's rather difficult to implement in Atom, maybe too much effort for the rare use case?

@nathansobo
Contributor

@batjko Thanks for the tip. Long term we definitely want to support files of arbitrary size, but an approach that doesn't load the whole file into memory is going to force us to redesign some of our synchronous APIs to be asynchronous due to the single-threaded nature of JavaScript. We made a pragmatic decision early on to perform editor state manipulations synchronously for a more convenient scripting experience, but we may need to revisit that decision.

@batjko
Contributor

batjko commented Jul 25, 2014

@nathansobo Personally, I'm happy with it as it is. I have no use for extreme file sizes in an editor.

I think the huge log file scenario is not really one for editors, but rather for viewers/greppers or whatever you might call that category.

@nathansobo
Contributor

Yeah, I pretty much agree. I'm confident we'll get there eventually, but there's a lot of other things we consider more important right now.

@simonzack

@batjko I don't think this is a rare use case, given the number of text editors that advertise this as a feature, and the comments so far on this issue. Log viewers and greppers are nice, but it is nicer to view and edit text in an editor. After all, isn't that what a text editor is for?

@detly

detly commented Aug 6, 2014

The other aspect to this is the failure mode while it isn't supported. If viewing and editing large text files isn't going to work for the time being, it might still be nice to have a better failure mode than locking up and forcing the user to kill Atom.

@nathansobo
Contributor

The failure mode is currently an error message displayed in the console for files exceeding 2MB, which should leave Atom in a usable state. Are you talking about large files that are < 2MB?

@detly

detly commented Aug 7, 2014

Yes. This is a 1.4M (generated) Python file.

@izuzak
Contributor

izuzak commented Aug 12, 2014

Yes. This is a 1.4M (generated) Python file.

@detly Can you share the file for which this is happening? Also, does this happen when you run atom in safe mode? Anything else special about the file (e.g. a really long line)?

@izuzak
Contributor

izuzak commented Aug 12, 2014

Can you share the file for which this is happening?

Actually, I just tried this myself and Atom was slowing down drastically on large files like that. I thought that wasn't happening on files of that size -- I thought I tried that a while ago but I might be wrong. Sorry about that!

@erikdonohoo

Super happy to hear this is being worked on. It's not just large binary files that are an issue here. Any JS framework (like Angular or Ember) ends up being too large for Atom to handle. It doesn't seem like an edge case to occasionally need to look at the source of some third-party package you are using.

@wpostma

wpostma commented Mar 27, 2015

This editor is a toy, if it can't handle large files. I'll be watching to see if Atom matures in this area.

@murphyj

murphyj commented Mar 30, 2015

Agree with @erikdonohoo - this is not an edge case. You also can't open some crucial files for Ghost because it's ember.js backed (hence has large binary vendor files).

@sduensin

sduensin commented Apr 3, 2015

This is a HUGE problem. 2 meg is nothing these days. As much as I like Atom, this limitation makes it pretty useless for day to day work. :-(

@sceee

sceee commented Apr 3, 2015

I can support this: many files (log files, for example) are larger than 2 MB. This means you always need another editor besides Atom to open them.

@gpluta

gpluta commented Apr 5, 2015

This limitation is really a dealbreaker when it comes to using atom... I hope this gets improved one day.

@mindvox

mindvox commented Apr 7, 2015

This should be considered a standard feature for a modern text editor. I would really love to see this feature added as Atom is such an amazing text editor.

@50Wliu
Contributor

50Wliu commented Apr 8, 2015

This is currently being worked on. The Atom team recognizes that it's a big problem, but it also needs a big rewrite in order to make everything work. See https://github.com/atom/text-document for more details.

@mindvox

mindvox commented Apr 8, 2015

@50Wliu I can appreciate the work involved. Looking forward to seeing this in a future release 😄

@backspaces

Could we pin this? The extraordinary atom/text-document should be shown to
all those concerned.

I say this because Atom is historic, like several JS/Html/Browser/Node
projects have been over the last few years.

Many of Atom's clients are simply looking for a coherent text editor.
That's fine. But they should also understand the JavaScript Revolution
background.


@Tapefabrik

+1

@mika76

mika76 commented Apr 8, 2015

I must admit these days the only thing keeping me from using Atom is the performance and the large file non-support. I can't wait to see this working 😁

@xpepermint

+1

@theksmith

+100

@pixelchutes

+2.0001 (MB) ...wait for it... 😉

UPDATE:
@thedaniel is right guys. Sorry for the spam / bad joke.

@thedaniel
Contributor

This is actively being worked on and is a high priority, no reason to post +1 comments anymore - it just spams everyone watching the issue.

@iamstarkov

@thedaniel issues can be closed for commenting

@ilanbiala

@thedaniel I think it would help if any other possible progress updates regarding perf. improvements were provided. If there aren't any, it would just be nice to know that it's happening by @nathansobo or you commenting in addition to adding the in progress label.

@nathansobo
Contributor

It's happening. Check out the text-document repo. Out of the country for a conference which is consuming some attention, but it's been almost my sole focus. Not much more to say than the progress and write-ups you can observe there. I empathize with everyone who wants this because I want it too, and it's been challenging but we're definitely getting there.

@robhawkes

@thedaniel @matmuchrapna: Yup, it's simple to lock the comments while still allowing contributors to discuss things.

@kylegoetz

I disagree this is an edge case. There are popular third-party JS libraries that are greater than this limit. What motivates my finding this bug is that Telerik's Kendo library is too big to open in Atom unless you open the minified version, which is, of course, not useful at all.

I'm having to read the library in Firefox's debugger since I can't open it in Atom. You can argue philosophy about whether I should be opening a file in Atom that I don't intend to edit, but I might edit it if I need to, at which point I'd consider it a first-party rather than a third-party lib, and I still couldn't edit it in Atom.

@atom locked and limited conversation to collaborators Apr 14, 2015

@benogle
Contributor

benogle commented Apr 14, 2015

I've locked this. We do believe this is an important case to handle. We are currently working on this. Please visit https://github.com/atom/text-document for more information.

@nathansobo
Contributor

Just an update here... in the interest of time we've decided to refocus our efforts on more incremental improvements.

  • One large chunk of work from text-document, the MarkerIndex, is in the process of being integrated into our existing document model. This will improve Atom's performance in the presence of large numbers of markers, such as when visualizing numerous search results in large files. The pull request should be merged in the next few days.
  • We're also working to reduce the memory consumption of our tokenized line representation.

We'll be searching for several more smaller wins like this, then redirect our attention back to some of the bigger changes researched in text-document.

Some ideas for other areas we plan on investigating in the next couple weeks:

  • Looking for low hanging fruit in the initial load of a document.
  • Looking for low hanging fruit in parser performance.
  • Exploring batching display buffer updates for changes within a single transaction.
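
To illustrate what batching updates within a single transaction could look like, here is a small self-contained sketch. `BatchingBuffer`, `transact`, `setLine` and `onDidChange` are hypothetical names chosen for this example and do not describe Atom's actual display buffer. Changes made inside `transact` are merged into one row range and listeners are notified once at the end, instead of once per change.

```javascript
// Hypothetical model, not Atom's API: a line buffer that coalesces change
// notifications made inside a transaction into a single event.
class BatchingBuffer {
  constructor(lines = []) {
    this.lines = lines.slice();
    this.listeners = [];
    this.pending = null; // { start, end } while a transaction is open
  }

  onDidChange(listener) {
    this.listeners.push(listener);
  }

  setLine(row, text) {
    this.lines[row] = text;
    if (this.pending) {
      // Inside a transaction: only widen the dirty row range.
      this.pending.start = Math.min(this.pending.start, row);
      this.pending.end = Math.max(this.pending.end, row);
    } else {
      this.emit({ start: row, end: row });
    }
  }

  transact(fn) {
    if (this.pending) return fn(); // nested call joins the outer transaction
    this.pending = { start: Infinity, end: -Infinity };
    try {
      return fn();
    } finally {
      const range = this.pending;
      this.pending = null;
      if (range.start <= range.end) this.emit(range); // one update for all changes
    }
  }

  emit(range) {
    for (const listener of this.listeners) listener(range);
  }
}
```

A view listening with `onDidChange` would then redraw the merged range once per transaction, which matters when a single command touches thousands of lines.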

Making Atom performant for files of all sizes remains a top priority. We'll continue to post updates here.

@nathansobo
Contributor

There's still more to do here, but a basic fix for the 80% case of loading large files is now on master. Before 1.0 I'd like to fix a few performance hiccups when moving the cursor in huge files. Post 1.0 I'd like to:

  • Syntax highlight in a background process so we can highlight huge files without hiccups.
  • Show a progress bar while performing I/O and computing initial metadata while loading a large file.
  • Drop the memory overhead another order of magnitude.
  • Support folds in large files

We can create separate issues for all of those, but I'm going to close this one. Atom can now load and edit large files. Don't worry, we'll keep working on refining it.
