Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add free memory detection and Notification to user #1295

Closed
thadguidry opened this issue Oct 28, 2017 · 16 comments
Closed

Add free memory detection and Notification to user #1295

thadguidry opened this issue Oct 28, 2017 · 16 comments
Assignees
Labels
Good First Issue Indicates issues suitable for newcomers to design or coding, providing a gentle introduction. preferences Storage and editing of user preferences Type: Feature Request Identifies requests for new features or enhancements. These involve proposing new improvements.
Milestone

Comments

@thadguidry
Copy link
Member

thadguidry commented Oct 28, 2017

... on 1st startup of OpenRefine.
Also add better/extra batch script output to warn user of low memory prior to starting Java VM.

This needs to work from Windows, Linux, and OSX.

Windows: C:\Users\Thad>wmic os get FreePhysicalMemory /format:list
Linux: free -m | grep -oP '\d+' | head -n 1
OSX: vm_stat ??? but what additional options to parse out the free mem ? or does top -l 1 | grep ^PhysMem get us closer to that ?

Question: Do we assume 50% of that for OpenRefine use to be a good citizen or can we push it to 75% ?

@thadguidry thadguidry self-assigned this Oct 28, 2017
@thadguidry thadguidry added this to the 2.7.2 milestone Oct 28, 2017
@jackyq2015
Copy link
Contributor

@thadguidry Auto setting is fine. But leaving the decision to end user might also an option. Your question is hard to answer because only end user knows their situation such as size of data set, complexity etc. Instead of auto set it, a responsive warning message when Out of Mem occurs could helpful

@thadguidry
Copy link
Member Author

"Instead of auto set it, a responsive warning message when Out of Mem occurs could helpful"
@jackyq2015 Indeed, that also is needed.

@ettorerizza
Copy link
Member

ettorerizza commented Oct 29, 2017

You are obviously right, @jackyq2015 . But we can safely bet that, nowadays, most users will increase the memory and give it a default level based on their hardware capabilities. They will then increase it on a case-by-case basis. Half of the RAM seems to me a good starting point. I give more personally (about 75%, 12GB out of 16), but I am kind of heavy user.

@magdmartin
Copy link
Member

Can we allow the user to set up the memory from the preference page? We can show to the user how much memory is available on the machine and let him/her adjust.

like @jackyq2015 mentioned, I will cautious to auto set the memory as 1) we don't know the size of the data set and 2) we don't know if the user is running and other RAM greedy application

@thadguidry
Copy link
Member Author

@magdmartin on point 2 we do know what the FREE memory is...and taking 50% of that the first time they run OpenRefine shouldn't cause too many problems. Yeah, we can do preference setup.

@thadguidry
Copy link
Member Author

@jackyq2015 Can you confirm that the OSX options above work well enough for us to parse out the FREE memory ?

@jackyq2015
Copy link
Contributor

OXS don't have the free command. But I confirmed this works:
vm_stat | perl -ne '/page size of (\d+)/ and $size=$1; /Pages\s+([^:]+)[^\d]+(\d+)/ and printf("%-16s % 16.2f Mi\n", "$1:", $2 * $size / 1048576);'

Link:
https://apple.stackexchange.com/questions/4286/is-there-a-mac-os-x-terminal-version-of-the-free-command-in-linux-systems

@thadguidry
Copy link
Member Author

thadguidry commented Nov 5, 2017

@jackyq2015 hmm... there's a Perl dependency... can you confirm if this works also ?
top -l 1 | grep ^PhysMem gets us closer to that ?

@jackyq2015
Copy link
Contributor

Yes. it works:
$ top -l 1 | grep PhysMem PhysMem: 1473M used (439M wired), 4543M unused.
$ top -l 1 | grep PhysMem | awk '{print $6}' 4535M

@thadguidry thadguidry assigned jackyq2015 and unassigned thadguidry Nov 10, 2017
@thadguidry thadguidry added component-Scripts Good First Issue Indicates issues suitable for newcomers to design or coding, providing a gentle introduction. Type: Feature Request Identifies requests for new features or enhancements. These involve proposing new improvements. preferences Storage and editing of user preferences labels Nov 10, 2017
@jackyq2015
Copy link
Contributor

I just realised there is some issue if we want to implement this:

  1. it will automatically set the RAM for user which will disable the possibility to tune the JVM themselves. Unless you provide extra switch to skip the automatically setting or enforce the user setting. That defeats the purpose of the intention to ease the process.
  2. The calculated RAM size may too small to too big to have the application launched.
    For example 32 bit system allow max around 4G, but 64 bit system is different. Also the different JVM vendors have its limitation differently. There are lots of combination of OS, Arch and JVM, not even mention about the user preference.

I would suggest that we leave this to end user or only provide the suggestion instead of set it directly.

Thoughts?

@thadguidry
Copy link
Member Author

thadguidry commented Nov 10, 2017

@jackyq2015 I thought of that also. Let's change the need of this issue then to a notification only.

We inspect (using the 3 OS memory lookups that found to work well above) and then advise the user on startup... "Hey ! We noticed that you have 6 Gigs of Free Memory available....OpenRefine can run better when given more memory. Read our FAQ on how to allocate more memory here: blahURL"

@jackyq2015
Copy link
Contributor

sure. we can do that.

@thadguidry
Copy link
Member Author

This can even be a Javascript popup

@thadguidry thadguidry changed the title Add free memory detection and auto setting from batch file Add free memory detection and Notification to user Nov 10, 2017
jackyq2015 added a commit to jackyq2015/OpenRefine that referenced this issue Nov 10, 2017
jackyq2015 added a commit that referenced this issue Nov 10, 2017
message notification of the free RAM. Issue #1295
@thadguidry
Copy link
Member Author

@jackyq2015 Beautiful. Good Job.

D:\git\OpenRefine>refine
You have 12219M of free memory.
You current configuration will allow to use 1400M of memory.
OpenRefine can run better when given more memory. Read our FAQ on how to allocate more memory here:
https://github.com/OpenRefine/OpenRefine/wiki/FAQ:-Allocate-More-Memory
.
14:47:49.668 [ refine_server] Starting Server bound to '127.0.0.1:3333' (0ms)
14:47:49.670 [ refine_server] refine.memory size: 1400M JVM Max heap: 1407188992 (2ms)
14:47:49.682 [ refine_server] Initializing context: '/' from 'D:\git\OpenRefine\main\webapp' (12ms)

@jackyq2015 jackyq2015 modified the milestone: 2.8 Nov 18, 2017
jackyq2015 added a commit that referenced this issue Jan 15, 2018
message notification of the free RAM. Issue #1295
wetneb pushed a commit that referenced this issue Feb 2, 2018
* fix the appbundle issue #1209

* fix #1162

allow the JRE 9

* fix the package declarations

* remove the _ from the method name

* use the explicit scoping

* remote extra ;

* fix issued from codacy

* fix issued from codacy

* add preferences link to the index page

* handle the empty user metadata

* fix 'last modified' sorting issue #1307

* prevent overflow of the table. issue #1306

* add isoDateParser to sort the date

* prevent overflow of the project index

* remove sorter arrow for action columns

* disable editing the internal metadata

* adjust the width of the table

* change MetaData to Metadata

* change the filed name from rowNumber to rowCount

* put back the incidently deleted gitignore

* add double quote to prevent word splitting

* UI improvement on metadata view and project list view

* remove the date field in metadata

* message notification of the free RAM. Issue #1295

* UI tuning for metadata view

* shorten the ISO date to locale date format

* Added translation using Weblate (Portuguese (Brazil))

* remove the rename link

* Ignore empty language files introduced by Weblate

* Add UI for Invert text filter

* Backend support for Inverting Text search facets

* Fix reset on text search facet

* More succinct return statements

* add tests for SetProjectMetadataCommand

* Tidying up for Codacy

* Added Tests for TextSearchFilter

* Corrections for Codacy

* More code tidy up

* let the browser auto fit the table cell when resizing/zooming

* fix import multiple excel with mulitple sheets issue #1328

* check if the project has the userMetadata

* fix the unit test
support multi files with multi tables for open office

* prevent the same key for user metadata

* replace _ with variable for exception

* fix the no-undef issue

* to adjust the width of transform dialog. issue #1332

* fix the row count refresh issue

* extract method

* move the log message

* cosmatic changes for codacy

* fix typo

* bump to version 2.8

* .gitignore is now working

* preview stage won't have the metadata populated, so protect NPE

* Update README.md

No more direct link to the last version tag, which will avoid having to think of updating the readme

* refacotring the ProjectMetadata class

* introduce the IMetadata interface

* create submodule of dataschema

* add back

* setup lib for dataschema; upgrade the apache lang to lang3

* replace escape* functions from apache lang3

* replace the ProjectMetadata with IMetadata interface

* add missing jars

* set the IMetadata a field of Project

* remove PreferenceStore out of Project model

* fix test SetProjectMetadataCommandTests by casting

* introdcue the AbstractMetadata

* introdcue the AbstractMetadata

* reorganize the metadata package

* allow have mulitiple metadata for a project

* support for mulitple metadata format

* remove jdk7 since 'table schema' java implmentation only support jdk8+

* set execute permission for script

* fix the Unit Test after Metadata refactoring

* restore the apache lang2.5 since jetty 6.1.22 depend on it

* add commons lang 2.5 jar

* git submodule add  https://github.com/frictionlessdata/datapackage-java

* remove the metadata parameter from the ProjectManager.registerProject method

* remove hashmap _projectsMetadata field from the ProjectManager and FileProjectManager

* init the Project.metadataMap

* fix Unit Test

* restore the ProjectMetaData map to ProjectManager

* put the ProjectMetaDta in place for ProjectManager and Project object

* check null of singleton instead of create a constructor just for test

* load the data package metadata

* importing data package

* importing data package

* encapsulate the Package class into DataPackageMetadata

* user _ to indicate the class fields

* introduce base URL in order to download the data files

* import data package UI and draft backend

* import data package UI

* fix typo

* download the data set pointed from metadata resource

* save and load the data package metadata

* avoid magic string

* package cleanup

* set the java_version to 1.8

* set the min jdk to 1.8

* add the 3rd party src in the build.xml

* skip the file selection page if only 1 DATA file

* add files structure for json editor

* seperate out the metadata file from the retrival file list

* rename the OKF_METADATA to DATAPACKAGE_METADATA

* clean up

* implement GetMetadateCommand class

* display the metadata in json format

* git submodule update --remote --merge

* adjust the setting after pulling from datapackage origin

* fix the failed UT DateExtensionTests.testFetchCounts due to new json jar json-20160810.jar will complain: JSONObject["float"] not a string.

* clean up the weird loop array syntax get complained

* remove the unused constant

* export in data package format

* interface cleanup

* fix UT

* edit the metadata

* add UT for SetMetadataCommand

* fix UT for SetMetadataCommand

* display the data package metadata link on the project index page

* update submodule

* log the exceptions

* Ajv does not work properly, use the back end validation instead

* enable the validation for jsoneditor

* first draft of the data validation

* create a map to hold the constraint and its handler

* rename

* support for minLength and maxLength from spec

* add validate command

* test the opeation instead of validate command

* rename the UT

* format the error message and push to the report

* fix row number

* add resource bundle for validator

* inject the code of the constrains

* make the StrSubstitutor works

* extract the type and format information

* add the customizedFormat to interface to allow format properly

* get rid of magic string

* take care of missing parts of the data package

* implement RequiredConstraint

* patch for number type

* add max/min constraints

* get the constrains directly from field

* implement the PatternConstraint

* suppress warning

* fix the broken UT when expecting 2 digits fraction

* handle the cast and type properly

* fix the missing resource files for data package when run from command line

* use the copy instead of copydir

* add script for appveyor

* update script for appveyor

* do recursive clone

* correct the git url

* fix clone path

* clone folder option does not work

* will put another PR for this. delete for now

* revert the interface method name

* lazy loading the project data

* disable the validate menu for now

* add UT

* assert UTs

* add UT

* fix #1386

* remove import

* test the thread

* Revert "test the thread"

This reverts commit 7792141.

* fix the URLCachingTest UT

* define the template data package

* tidy up the metadata interface

* check the http response code

* fix the package

* display user friendly message when URL path is not reachable

* populate the data package schema

* Delete hs_err_pid15194.log

* populate data package info

* add username  preference and it will be pulled as the creator of the metadata

* undo the project.updateColumnChange() and start to introduce the fields into the existing core model

* tightly integrate the data package metadata

* tightly integrate the data package metadata for project level

* remove the submodule

* move the edit botton

* clean up build

* load the new property

* load the project metadata

* fix issues from codacy

* remove unused fields and annotation

* check the http response code firstly

* import zipped data package

* allow without keywords

* process the zip data package from url

* merge the tags

* check store firstly

* remove the table schema src

* move the json schema files to schema dir

* add comment

* add comment

* remove git moduels

* add incidently deleted file

* fix typo

* remove SetMetadataCommand

* revert change

* merge from master
@tfmorris
Copy link
Member

No one is ever going to see a couple of random lines of text at the beginning of the server log (which usually isn't even visible). It also doesn't appear to be accurate on OS X where it says:

You have 477M of free memory.
Your current configuration is set to use 1400M of memory.

when I have a couple of GB free even after Refine has started up and if I gave it more memory the operating system would just reclaim more memory from other processes (in other words "free memory" isn't really a good criterium to use).

GC overhead or heap usage might be better feedback for the user and, of course, any feedback needs to be presented through the web interface where they can see it.

@thadguidry
Copy link
Member Author

@tfmorris Yeap. Agree. The text only helped for a small number of users. Getting the web UI to present memory info would be really nice. (hmm, also of note, we get part of that for free now on the spark-prototype branch as part of the Spark's UI memory management at http://<driver>:4040)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Good First Issue Indicates issues suitable for newcomers to design or coding, providing a gentle introduction. preferences Storage and editing of user preferences Type: Feature Request Identifies requests for new features or enhancements. These involve proposing new improvements.
Projects
None yet
Development

No branches or pull requests

5 participants