Improve method ProcessInMemory in AnalyzeCommand. #104

ArturLavrov · 2020-01-25T15:17:34Z

Hello!
Thanks for this project. I found it useful! You guys did a great job!

Is your feature request related to a problem? Please describe.

Some people have run into an OutOfMemoryException during the execution of AnalyzeCommand
(#103, #91).

I dug into this class (AnalyzeCommand.cs) and found some code that I think could be improved.

I'm talking about the method ProcessAsFile(string filename) in AnalyzeCommand.cs.

void ProcessAsFile(string filename)
{
    if (File.Exists(filename))
    {
        _appProfile.MetaData.FileNames.Add(filename);
        _appProfile.MetaData.PackageTypes.Add(ErrMsg.GetString(ErrMsg.ID.ANALYZE_UNCOMPRESSED_FILETYPE));

        // The entire file is read into memory before any size check happens.
        string fileText = File.ReadAllText(filename);
        ProcessInMemory(filename, fileText);
    }
    else
    {
        throw new OpException(ErrMsg.FormatString(ErrMsg.ID.CMD_INVALID_FILE_OR_DIR, filename));
    }
}

I'm concerned about these lines:

string fileText = File.ReadAllText(filename);
ProcessInMemory(filename, fileText);

Here we read the whole file into memory using the ReadAllText method and then call ProcessInMemory, which checks whether the length of the string exceeds the specified threshold:

if (fileText.Length > MAX_FILESIZE)
{
   //Some stuff goes here.
   return;
}

The fact that we perform this operation in a loop, once for each file, troubles me a little.
Potentially this could create memory pressure.

Describe the solution you'd like

What if, before reading the whole file into memory, we check the actual file size and only then decide whether to process the file or not?
I think this could be achieved using the FileInfo.Length property.
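A minimal sketch of that idea, not the project's actual code: the class name FileSizeGate, the method ReadIfSmallEnough, and the 5 MB default are all hypothetical (the real MAX_FILESIZE value is not shown in this thread). It only illustrates checking FileInfo.Length before calling File.ReadAllText.

```csharp
using System;
using System.IO;

static class FileSizeGate
{
    // Hypothetical threshold for illustration only; the real MAX_FILESIZE is not shown here.
    const long MaxFileSizeBytes = 5L * 1024 * 1024;

    // Returns the file text, or null when the file is larger than maxBytes.
    public static string ReadIfSmallEnough(string filename, long maxBytes = MaxFileSizeBytes)
    {
        var info = new FileInfo(filename);
        if (!info.Exists)
        {
            throw new FileNotFoundException("File not found.", filename);
        }

        // FileInfo.Length is the size on disk in bytes; no file content is read here.
        if (info.Length > maxBytes)
        {
            return null;
        }

        return File.ReadAllText(filename);
    }
}
```

One caveat: FileInfo.Length counts bytes, while fileText.Length counts characters, so the two checks are not strictly equivalent for multi-byte encodings. The byte count is never smaller than the character count for common encodings, so a byte-based check errs on the side of skipping.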

What do you think of this?
Feel free to correct me if I'm mistaken or didn't notice something.

Have a nice day!

guyacosta · 2020-01-25T20:47:26Z

Fantastic catch. Note that the reason ProcessAsFile turns around and calls ProcessInMemory is to optimize the core processing logic that is used for both compressed and uncompressed files. But by placing the file size check in the Run loop, I can avoid opening those large files to begin with. I do have to check again for files once they are decompressed, but that is far better than reading large files and then finding out we didn't want to. Hopefully this will improve results for #103 and #91.

guyacosta pushed a commit that referenced this issue Jan 25, 2020

Fix for #104 idea submitted by ArthusLavrov to optimize file size che…

1f56f6e

…cks.

guyacosta mentioned this issue Jan 25, 2020

Fix for #104 idea submitted by ArthusLavrov to optimize file size che… #106

Merged

guyacosta added a commit that referenced this issue Jan 25, 2020

Merge pull request #106 from microsoft/FileSizeCheckOptimization

040c341

Fix for #104 idea submitted by ArthusLavrov to optimize file size che…

guyacosta closed this as completed Jan 25, 2020

This was referenced Jan 25, 2020

Getting an OutOfMemoryException #103

Closed

Insufficient memory to continue the execution of large program. #91

Closed
