Need a list of all preferences that can be set via UI or openrefine.l4j.ini #1677

Closed
quirkiest opened this issue Jul 13, 2018 · 30 comments
Labels: preferences, Type: Documentation

Comments

@quirkiest

I work with very large datasets (~20MM rows x 12 cols) and set cache & memory to take advantage of the big EC2 server I am using. Glad to see that in V3 this seems to be very easy.

I also set preferences:
-Drefine.data_dir=D:\OpenRefine\working
-Dui.browsing.listFacet.limit=5000

I'd love to know what the range of prefs actually is: what is available? For example, I set the default number of display rows via data-table-view.js, but if it were a preference that would make life much easier.

@thadguidry added this to the 4.0 Beauty Queen milestone Jul 18, 2018
@thadguidry added the preferences and Type: Feature Request labels Jul 18, 2018
@thadguidry
Member

thadguidry commented Jul 18, 2018

JVM preferences are many... refer to official Java documentation for those.

For OpenRefine preferences...
Unfortunately, we were not consistent in how we wired up the PreferenceStore. But if you look in each of these files from this search, you will see many of them, like the listFacet.limit you found:

https://github.com/OpenRefine/OpenRefine/search?p=2&q=getPreferenceStore%28%29&unscoped_q=getPreferenceStore%28%29

TODO:

Expressions (Json Array)

Language

  • userLang (Default is en for English)

Facets

  • ui.browsing.listFacet.limit
  • ui.browsing.pageSize

Google Drive Timeouts

  • googleConnectTimeOut
  • googleReadTimeOut

Export Template

  • exporters.templating.template

Reconciliation (Json Array)

  • reconciliation.standardServices

Metadata (Json Array)

  • USER_NAME
  • projectName
  • projectTags
  • title
  • homepage
  • image
  • license
  • encoding
  • userMetadata
  • customMetadataColumns
  • ??? @jackyq2015 will have to document these.

Wikidata (Json Array)

  • "wikidata_credentials":["username", "password"]

@thadguidry
Member

thadguidry commented Feb 27, 2019

Another added today with #1959

Show reconciliation previews on hover of reconciliation candidates

  • cell-ui.previewMatchedCells=false

@wetneb
Sponsor Member

wetneb commented Feb 27, 2019

Actually it's cell-ui.previewMatchedCells=false, my commit message was wrong…

@thadguidry
Member

fixed above

@quirkiest
Author

quirkiest commented Feb 28, 2019 via email

@wetneb added the Type: Documentation label and removed the Type: Feature Request label Dec 20, 2019
@wetneb removed this from the 4.0 milestone Dec 20, 2019
@thadguidry added this to TODO in UI Improvements Mar 13, 2020
@ostephens self-assigned this Jun 23, 2020
@antoine2711
Member

> Language
>
>   • userLang (Default is UTF-8)

@thadguidry: is it possible you meant: « en » (english) as the default language?

Regards,
Antoine

@thadguidry
Member

thadguidry commented Jun 23, 2020

@antoine2711 Yes, sorry, that should be en (was found as a change here)

Also, stay tuned... @ostephens is going to update this issue with a complete list of preferences (since @allanaaa asked for this so we can fully document them). The hope is that all the preferences can be shown and exposed with a better UI concept in Issue #2796. If you can work on #2796 and give us a prototype over the next month or two or three, that would be amazing!

@tfmorris
Member

There's a misconception in the original post @quirkiest - giving the JVM an argument of -Dui.browsing.listFacet.limit=5000 has no effect. The OpenRefine preference store is disjoint from Java properties.

The JVM defined settings like refine.autosave and refine.data_dir are another category of things that need to be documented. Here's the current list:

refine.autoreload - false
refine.autosave - 5 (minutes)
refine.connection.max_idle_time - 60000
refine.context_path - /
refine.data_dir
refine.development - false
refine.headless - false
refine.host - 127.0.0.1
refine.max_form_content_size - 1048576
refine.memory - 1400 MB (set at JVM startup, so informational only)
refine.port - 3333
refine.queue.idle_time - 60 
refine.queue.max_size - 300
refine.queue.size - 30
refine.scanner.period - 1
refine.verbosity - info
refine.webapp - main/webapp
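
Settings in this category reach OpenRefine as Java system properties, that is, as `-Dkey=value` arguments on the Java command line (the `-Drefine.data_dir=...` line in the original post is one). A minimal sketch of that formatting follows. The helper name `to_jvm_args` and the example values are assumptions made for illustration and are not part of OpenRefine.

```python
def to_jvm_args(settings: dict[str, object]) -> list[str]:
    """Format each key/value pair as a -Dkey=value JVM system property argument."""
    return [f"-D{key}={value}" for key, value in settings.items()]


# Setting names come from the list above; the values are illustrative only.
server_settings = {
    "refine.port": 3333,
    "refine.host": "127.0.0.1",
    "refine.data_dir": r"D:\OpenRefine\working",
}

# Preference keys such as ui.browsing.listFacet.limit are deliberately left out:
# as noted above, passing them to the JVM this way has no effect.
print(" ".join(to_jvm_args(server_settings)))
```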

Some of these are (semi)-documented in the startup scripts, which also sometimes have different defaults, although some things there, like refine.google_api_key are obsolete.

@allanaaa
Contributor

allanaaa commented Jul 7, 2020

I am just starting to look at Preferences and I see that refine.data_dir is referred to as a JVM defined setting in this thread.

In reference to the instructions given at https://github.com/OpenRefine/OpenRefine/wiki/FAQ#how-do-i-change-the-workspace-directory-that-i-want-refine-to-use-for-its-project-storage-

Alternatively, you can update and add a preference at http://127.0.0.1:3333/preferences ,
KEY = refine.data_dir
VALUE = T:\MyOpenRefineDataFolder

Can you or can you not set refine.data_dir on the Preferences page as suggested? I tested it myself and it has no effect, but I want to double-check.
For a second I was concerned this was an OS-related difference but that doesn't make sense, does it? :)

@tfmorris
Member

tfmorris commented Jul 7, 2020

> Can you or can you not set refine.data_dir on the Preferences page as suggested?

Can not. Preferences are stored in the workspace directory, so it's already been located by the time preferences are loaded. I've removed the erroneous text from the wiki.
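
The ordering problem can be sketched roughly as below. This is only an illustration of the reasoning, not OpenRefine's startup code: the file name `preferences.json` and the function `load_preferences` are hypothetical.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical file name, used only for this sketch.
PREFS_FILE = "preferences.json"


def load_preferences(jvm_properties: dict[str, str], default_dir: Path) -> dict:
    """Locate the workspace first, then read the preferences stored inside it."""
    # Step 1: the workspace comes from the JVM property (or a default).
    workspace = Path(jvm_properties.get("refine.data_dir", default_dir))
    # Step 2: only now can preferences be read, because they live in the workspace.
    prefs_path = workspace / PREFS_FILE
    if not prefs_path.exists():
        return {}
    return json.loads(prefs_path.read_text(encoding="utf-8"))


with tempfile.TemporaryDirectory() as tmp:
    workspace = Path(tmp)
    # A refine.data_dir entry saved as a preference is inert data at this point:
    # the workspace was already chosen before this file could be opened.
    (workspace / PREFS_FILE).write_text(
        json.dumps({"refine.data_dir": "/some/other/dir", "userLang": "en"}),
        encoding="utf-8",
    )
    prefs = load_preferences({"refine.data_dir": str(workspace)}, Path.home())
    print(prefs["userLang"])
```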

@wetneb
Sponsor Member

wetneb commented Jul 7, 2020

It would be great to have clear documentation explaining the difference between preferences and server settings.

@allanaaa
Contributor

I've added a table to the Wiki page to figure out precisely what preferences information we should provide to users.

Just now, during testing, setting ui.browsing.pageSize had no effect. I set it to a few values (100, 65, 12, etc.) but didn't see any result.
I tried the former name (ui.gridPaginationSize) as well and didn't see any result (from reading #2817).

@wetneb
Sponsor Member

wetneb commented Jul 20, 2020

We should document at which version these options have been added: for the one you mention, 3.5.

@ostephens
Sponsor Member

ostephens commented Jul 21, 2020

The only preferences that will be shown in the "Preferences" screen in the UI are those that are not project dependent and that have values that are:

  • null
  • String
  • Number
  • Boolean
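
As a rough sketch of the value-type half of that rule (the "not project dependent" half is not modelled here), the check could look like this. The function name `is_shown_in_ui` and the sample values are assumptions for illustration only.

```python
def is_shown_in_ui(value: object) -> bool:
    """True for null, string, number, or boolean values; False for anything else."""
    return value is None or isinstance(value, (str, int, float, bool))


# Illustrative preference values; the keys come from this thread.
preferences = {
    "userLang": "en",
    "ui.browsing.listFacet.limit": 5000,
    "cell-ui.previewMatchedCells": "false",
    "reconciliation.standardServices": [{"name": "hypothetical service"}],
}

# Array-valued preferences are filtered out, scalar ones are kept.
visible = {key: value for key, value in preferences.items() if is_shown_in_ui(value)}
print(sorted(visible))
```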

I think this means the only preferences that can be set and then viewed via the UI are:

Core product

  • userLang (String, needs to be a valid language code, defaults to en if not set)
  • ui.browsing.listFacet.limit (Integer, default of 2000 used if the preference isn't set)
  • ui.browsing.pageSize (a string of the form "[a, b, c, ...]" where each of a, b, c is an integer not below 1 and not above 10000. There must be at least two integers inside the square brackets.) EDIT 2020-07-21T18:53Z - clarifying further (see comments below)
  • cell-ui.previewMatchedCells (String, true or false)

GData extension

  • googleConnectTimeOut
  • googleReadTimeOut

Wikidata extension

  • wikibase.upload.maxLag (default of 5 used if the preference isn't set)

While you can set a preference with any name from the Preferences screen in the UI, if you try to set something that can't be read appropriately then you will break stuff! So for example if you set scripting.starred-expressions via the UI you'll break the starred expressions (because you end up setting a string, which can't be read appropriately as the type of object expected to be stored in that preference).

@tfmorris
Member

> While you can set a preference with any name from the Preferences screen in the UI, if you try to set something that can't be read appropriately then you will break stuff! So for example if you set scripting.starred-expressions via the UI you'll break the starred expressions

That sounds like a bug to me. The user shouldn't be able to corrupt their preferences.

@antoine2711
Member

antoine2711 commented Jul 21, 2020

>   • ui.browsing.pageSize (Integer, from 3.5 onwards only)

@ostephens: I would respectfully contradict you here: ui.browsing.pageSize must be an array of at least 2 integers, each integer not below 1 and not above 10000.

Regards,
Antoine

@ostephens
Sponsor Member

@antoine2711 - ah thanks. I'm afraid I'd completely misconstrued the purpose of this and had not gone back and looked at the code - entirely my fault. This means it won't be settable via the Preferences page in the UI as it stands, so I've struck it through in the list above

@ostephens
Sponsor Member

@tfmorris agreed ... but the whole thing is a mess to be honest.

I can create an issue for this bug, but I think that a re-think of how these 'settable' preferences work might be a better idea?

@antoine2711
Member

antoine2711 commented Jul 21, 2020

> @antoine2711 - ah thanks. I'm afraid I'd completely misconstrued the purpose of this and had not gone back and looked at the code - entirely my fault. This means it won't be settable via the Preferences page in the UI as it stands, so I've struck it through in the list above

Actually @ostephens, this integer array is expressed as a string. So a correct entry would be: [50, 100, 500, 1000], as a string.
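
A minimal sketch of a check for that format, using the constraints stated in this thread (at least two integers, each between 1 and 10000). Parsing the string as JSON is an assumption of this sketch, and OpenRefine's own parsing may differ. The function name `is_valid_page_size` is hypothetical.

```python
import json


def is_valid_page_size(value: str) -> bool:
    """Check the "[a, b, c, ...]" string form described in this thread."""
    try:
        parsed = json.loads(value)
    except (TypeError, ValueError):
        return False
    if not isinstance(parsed, list) or len(parsed) < 2:
        return False
    # bool is a subclass of int in Python, so exclude it explicitly.
    return all(
        isinstance(n, int) and not isinstance(n, bool) and 1 <= n <= 10000
        for n in parsed
    )


print(is_valid_page_size("[50, 100, 500, 1000]"))  # True
print(is_valid_page_size("[50]"))                  # False: fewer than two integers
print(is_valid_page_size("[0, 100]"))              # False: 0 is below 1
print(is_valid_page_size("100"))                   # False: not an array
```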

I will clarify that in the documentation.

Regards,
Antoine

@ostephens
Sponsor Member

@antoine2711 is there a maximum number of integers in the array?

@antoine2711
Member

antoine2711 commented Jul 21, 2020

> @antoine2711 is there a maximum number of integers in the array?

@ostephens: no. Now that you mention it, the array should be truncated at 10 items, I guess.

Regards, A.

@allanaaa
Contributor

allanaaa commented Aug 6, 2020

Is there a way to set "show null values in cells" as an install-wide preference, like, by default on for all new projects? I would love that.

@thadguidry
Member

@allanaaa Yes, this is an option to toggle under All menu (1st column)

@tfmorris
Member

tfmorris commented Aug 7, 2020

> Yes, this is an option to toggle under All menu (1st column)

What is the name of the key for that preference setting? I don't remember coming across anything like this.

@thadguidry
Member

thadguidry commented Aug 7, 2020

@ostephens Do you recall what the preference setting is for show nulls?

@tfmorris I found the work he did in PR #1571 and #1544; around those summer months in 2018, Owen also did lots of other null-handling fixes in other commits, if you look.

@StoltHD

StoltHD commented Aug 7, 2020

So where do we (the non-Java-programmer USERS) actually set the working directory when running on Windows?

AND where do we actually set the amount of memory that ACTUALLY gets used?

I have tried to set it in "refine.ini", in "openrefine.l4j.ini", and in "refine.bat"; none of it works...

So can you please write some minimal documentation that actually references the correct way to set the most important settings BEFORE we start OpenRefine...

OpenRefine is a great tool, but seriously, it's a mess to try to get it to utilize the computer hardware correctly...
i.e. I have 64GB of RAM, a RAID 0 of 4 fast NVMe drives for data, 12 cores and 24 threads... and I can't get OpenRefine to utilize anything more than a few GB of memory.

But I have more or less given up on OpenRefine and gone back to using Excel 2016 and Power Query, because I can't find any documentation that gives me information on settings and configurations for OpenRefine that actually work...

Wherever I set "Drefine.data_dir= {Some folder}", OpenRefine falls back to "\AppData\Roaming\OpenRefine".

@thadguidry
Member

thadguidry commented Aug 7, 2020

@StoltHD We have all of this documented in our Wiki pages here on GitHub https://github.com/openrefine/openrefine/wiki and specifically about memory settings here https://github.com/OpenRefine/OpenRefine/wiki/FAQ-Allocate-More-Memory

You can also take a look at our "work in progress" docs to see if that also helps you a bit more specifically about running OpenRefine https://docs.openrefine.org/manual/running

I can tell you are frustrated; hang in there for just a bit. I'd advise looking at the things above, and then hopping onto our Gitter chat for direct support, or our mailing list. Isn't it great that you get free support at any time from us? :-) Try that with Excel

My refine.ini file looks like:

# NOTE: This file is not read if you run the Refine executable directly
# It is only read if you use the refine shell script or refine.bat

# FOR DEVELOPERS: you can copy refine.ini and rename it to refine-dev.ini
# Configurations in refine.ini will be ignored if refine-dev.ini exists
# refine-dev.ini won't be tracked by Git, so feel free to put your custom configurations in it

no_proxy="localhost,127.0.0.1"
#REFINE_PORT=3334
#REFINE_HOST=127.0.0.1
#REFINE_WEBAPP=main\webapp

# Memory and max form size allocations
#REFINE_MAX_FORM_CONTENT_SIZE=1048576
REFINE_MEMORY=4000M

# Set initial java heap space (default: 256M) for better performance with large datasets
REFINE_MIN_MEMORY=1400M

# Some sample configurations. These have no defaults.
#JAVA_HOME=C:\Program Files\Java\jdk1.8.0_151
# Use a single JAVA_OPTIONS that includes any JVM options you need upon OpenRefine startup
JAVA_OPTIONS=-Drefine.data_dir=E:\openrefine_data

# Uncomment to increase autosave period to 60 mins (default: 5 minutes) for better performance of long-lasting transformations
#REFINE_AUTOSAVE_PERIOD=60

# OAuth credentials configurations for Google Data
# Default OAuth credentials for Google Data have been embedded in the release version of OpenRefine
# So if you are a user, you can just skip these configurations, but it's recommended to use your own credentials
# If you are a developer, you'll need to acquire them by yourself
# To get your own credentials, please see the wiki: https://github.com/OpenRefine/OpenRefine/wiki/Google-Extension
# The wiki will guide you to get a client_id/client_secret pair
# After getting the client_id and client_secret, put them below
#GDATA_CLIENT_ID=your_client_id
#GDATA_CLIENT_SECRET=your_client_secret
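
When a setting seems to be ignored, one quick check is whether its line is still commented out. The sketch below lists the active `KEY=VALUE` lines in text shaped like the file above. It does not reproduce how the refine script or refine.bat actually read the file. The function name `active_settings` and the shortened sample text are assumptions for illustration.

```python
def active_settings(ini_text: str) -> dict[str, str]:
    """Return the KEY=VALUE pairs on lines that are not blank or commented out."""
    settings = {}
    for raw_line in ini_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values such as -Dkey=value stay intact.
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


# Shortened sample modelled on the file above.
sample = """\
#REFINE_PORT=3334
REFINE_MEMORY=4000M
REFINE_MIN_MEMORY=1400M
JAVA_OPTIONS=-Drefine.data_dir=E:\\openrefine_data
"""

for key, value in active_settings(sample).items():
    print(f"{key} -> {value}")
```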

@ostephens
Sponsor Member

> Is there a way to set "show null values in cells" as an install-wide preference, like, by default on for all new projects? I would love that.

> Yes, this is an option to toggle under All menu (1st column)

> What is the name of the key for that preference setting? I don't remember coming across anything like this.

This isn't a preference setting. Selecting the menu item just toggles the styling of a span in the cell.

@thadguidry
Member

@ostephens Thanks Owen, then in that case, I guess #3058 will be useful for Allana and others.

@wetneb
Sponsor Member

wetneb commented May 9, 2021

This is now documented here: https://docs.openrefine.org/manual/running


8 participants