Handle large history file properly by reading lines in the streaming way #3810

Merged · 3 commits · Oct 3, 2023
51 changes: 50 additions & 1 deletion PSReadLine/History.cs
@@ -457,12 +457,61 @@ private void ReadHistoryFile()
{
WithHistoryFileMutexDo(1000, () =>
{
-            var historyLines = File.ReadAllLines(Options.HistorySavePath);
+            var historyLines = ReadHistoryLinesImpl(Options.HistorySavePath, Options.MaximumHistoryCount);
UpdateHistoryFromFile(historyLines, fromDifferentSession: false, fromInitialRead: true);
var fileInfo = new FileInfo(Options.HistorySavePath);
_historyFileLastSavedSize = fileInfo.Length;
});
}

static IEnumerable<string> ReadHistoryLinesImpl(string path, int historyCount)
{
const long offset_1mb = 1048576;
const long offset_05mb = 524288;

// 1mb content contains more than 34,000 history lines for a typical usage, which should be
// more than enough to cover 25,000 history records (a history record could be a multi-line
// command). Similarly, 0.5mb content should be enough to cover 10,000 history records.
// We optimize the file reading when the history count falls in those ranges. If the history
// count is even larger, which should be very rare, we just read all lines.
long offset = historyCount switch
{
<= 10000 => offset_05mb,
<= 25000 => offset_1mb,
_ => 0,
};

using var fs = new FileStream(path, FileMode.Open);
using var sr = new StreamReader(fs);

if (offset > 0 && fs.Length > offset)
{
// When the file is larger than the chosen offset (0.5mb or 1mb), we only read that much content from the end.
int? b1 = null, b2 = null;
fs.Seek(-offset, SeekOrigin.End);

// After seeking, the current position may point at the middle of a history record, or even
// a byte within a unicode char. So, we need to find the start of the next history record.
while ((b2 = fs.ReadByte()) is not -1)
{
// Read bytes until we find the first newline ('\n' == 0xA) that is not right after a backtick ('`' == 0x60).
// It means a separate full history record will start from the next byte.
if (b2 is 0xA && b1.HasValue && b1 is not 0x60)
{
break;
}

b1 = b2;
}
}

// Read lines in a streaming way, so it won't consume too much memory even if we have
// to read all lines from a large history file.
while (!sr.EndOfStream)
{
yield return sr.ReadLine();
}
}
}

void UpdateHistoryFromFile(IEnumerable<string> historyLines, bool fromDifferentSession, bool fromInitialRead)
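
For context, PSReadLine writes a multi-line command to its history file with a trailing backtick on each line except the last, which is why the scan above only treats a newline that is not preceded by a backtick as the end of a record. The stand-alone C# sketch below (hypothetical names, file path, and contents — not part of this PR) demonstrates the same tail-seek idea in isolation:

using System;
using System.Collections.Generic;
using System.IO;

class TailReadSketch
{
    static void Main()
    {
        // 'echo one' and 'echo three' are single-line records; the 'if' block is one
        // multi-line record whose continued lines end with a backtick.
        string path = Path.GetTempFileName();
        File.WriteAllText(path, "echo one\nif ($true) {`\n  'two'`\n}\necho three\n");

        // Seeking 16 bytes back from the end lands in the middle of the multi-line
        // record, so only the next complete record ('echo three') is yielded.
        foreach (string line in TailLines(path, offset: 16))
        {
            Console.WriteLine(line);
        }
    }

    static IEnumerable<string> TailLines(string path, long offset)
    {
        using var fs = new FileStream(path, FileMode.Open, FileAccess.Read);
        using var sr = new StreamReader(fs);

        if (fs.Length > offset)
        {
            fs.Seek(-offset, SeekOrigin.End);

            // Skip to the end of the record we landed in the middle of: the first
            // newline whose preceding byte is known and is not a backtick.
            int prev = -1, cur;
            while ((cur = fs.ReadByte()) != -1)
            {
                if (cur == '\n' && prev != -1 && prev != '`')
                {
                    break;
                }

                prev = cur;
            }
        }

        // The StreamReader has not buffered anything yet, so it starts reading at the
        // stream's current position, i.e. at the first complete record.
        while (!sr.EndOfStream)
        {
            yield return sr.ReadLine();
        }
    }
}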