
RF: return values (#1350 continued) #1409

Merged 245 commits into master on May 6, 2017
Conversation

@mih (Member) commented Mar 21, 2017

Basic idea: commands yield report dictionaries. And wrappers act upon this information in whatever desired way.
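The basic idea can be sketched as follows. This is an illustrative approximation, not the actual DataLad code: the helper name `get_status_dict`, the field names, and the example commands are assumptions for demonstration.

```python
# Hypothetical sketch of the "commands yield report dictionaries" protocol.
# All names here are illustrative assumptions, not DataLad's real API.

def get_status_dict(action, path, status, message=None):
    """Build a minimal result record, as a command would report it."""
    res = {'action': action, 'path': path, 'status': status}
    if message:
        res['message'] = message
    return res

def fake_install(paths):
    """A 'command': a generator yielding one result dict per item."""
    for p in paths:
        status = 'ok' if not p.startswith('missing') else 'error'
        yield get_status_dict('install', p, status)

def render_simple(results):
    """A 'wrapper' acting on the results: 'STATUS: PATH' lines."""
    return ['{status}: {path}'.format(**r) for r in results]

print(render_simple(fake_install(['ds1', 'missing/ds2'])))
```

Because the command is a generator, a wrapper can render, filter, or collect results without the command itself knowing which consumer it feeds.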

Sits on top of #1348. Present tail of the procedure is in #1409

Change summary:

  • global command line switch to select output format
  • line-based 'json' output format (like git-annex)
  • 'simple' output format for "STATUS: PATH" lines, where PATH is relative whenever a reference dataset is available
  • supported commands return generators that are actually executed in main.py
  • centralized error detection that guarantees an exception if any error occurred
  • BF: update did not fail on unavailable paths as argument
  • customize exception generation to be less dumb
  • centralize logging by shipping messages in the status dicts (actually: message template, variable, logger instance or func)
  • reimplement forced 'cmdline' rendering of API_ function with the new facilities to get rid of these shadow functions
  • convert actual return value nature on a by-call basis. Standard should be json dict sequence/generator.
  • provide return_value helper module with:
    - function to give a default dict
    - function to convert git-annex json record into our own structure
  • implement new constraint to test whether a dict has a key with a certain value
  • implement general command line options to filter results by type and by status (using new constraint above)
  • implement new general kwarg for all commands to accept a filter function for results (to fit a passed (compound) constraint) (name: filter_results)
  • implement new general kwarg for all commands to switch return values from generator to sequence (spec: return_type, {'generator', 'list', 'item-or-list'})
  • implement flexible transformation of return values from result dicts to other types (like Datasets)
  • implement general --on-failure switch (fixes #819: new general option on_failure to indicate action when something goes wrong)
  • new test helpers: assert_status, assert_message, assert_in_results, assert_not_in_results, assert_result_count, assert_fields_equal
  • support for result summaries (#1380: fate and scope of "result summaries")
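To make the consumer-side items above concrete, here is a minimal sketch of how filter_results, return_type, and on_failure could interact. This is a rough illustration under assumed semantics; the function name `run_command` and the exact failure policy are hypothetical, not DataLad's implementation.

```python
# Illustrative sketch (not actual DataLad code) of the consumer-side
# machinery: filtering result dicts, converting the generator to a
# list, and raising once any result carries an error status.

def run_command(results, filter_results=None, return_type='generator',
                on_failure='stop'):
    def _gen():
        failures = []
        for res in results:
            if res.get('status') in ('error', 'impossible'):
                failures.append(res)
                if on_failure == 'stop':
                    # fail fast on the first bad result
                    raise RuntimeError(
                        'command failed on %s' % res.get('path'))
            if filter_results is None or filter_results(res):
                yield res
        # 'continue': finish, then raise if anything went wrong;
        # 'ignore': never raise
        if failures and on_failure == 'continue':
            raise RuntimeError('%d result(s) with error status' % len(failures))
    if return_type == 'generator':
        return _gen()
    out = list(_gen())
    if return_type == 'item-or-list':
        return out[0] if len(out) == 1 else out
    return out

results = [
    {'action': 'get', 'path': 'a', 'status': 'ok'},
    {'action': 'get', 'path': 'b', 'status': 'notneeded'},
]
only_ok = run_command(iter(results),
                      filter_results=lambda r: r['status'] == 'ok',
                      return_type='list')
print(only_ok)  # a single-element list
```

The same pattern also covers the constraint-based filtering above: a compound constraint is just a predicate passed as filter_results.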

Commands converted:

  • create (possibly TODO: report addition to a superdataset)
  • install -- RF'ed to remove redundant code; split core functionality into a clone command (fully converted); high-level condition checking in install itself still needs conversion
  • get
  • add (complicated due to the old-school recursion in helper functions, needs substantial RF)
  • publish (seems straightforward, but has quite some diff to master, not attempted)
  • uninstall (needs additional RF, PY2 version of Ben's class detector cannot deal with more than one class per module)
  • drop (needs additional RF, PY2 version of Ben's class detector cannot deal with more than one class per module)
  • remove (needs additional RF, PY2 version of Ben's class detector cannot deal with more than one class per module)
  • update (cmd itself is done, but needs to pass through results from get)
  • create-sibling (seems doable, unclear if RF needed to pass add-sibling results through)
  • create-sibling-github (doable)
  • add-sibling (better waits till after general RF; convoluted already, calling for sibling command)
  • rewrite-urls (does not seem to be used anywhere -> low priority; but otherwise simple)
  • unlock
  • save
  • export (seems doable, but also rather useless)
  • search (will not be attempted, waiting for metadata RF)
  • aggregate-metadata (will not be attempted, waiting for metadata RF)
  • test (not needed)
  • crawl, crawl-init (no idea if it is feasible to attempt this, better be done by Yarik; also crawl-init is about to go away, see #1155: move crawl-init into crawl, making crawl accept additional params to be passed into the pipeline)
  • ls (will not be attempted, does not fit paradigm as of now, but could become relevant after RF)
  • clean (looks straightforward)
  • add-archive-content
  • download-url (seems straightforward, custom stop_on_failure needs to be RF'ed away)
  • create-test-dataset
  • sshrun (out of scope, needs to stay as it is)

Copied from #1468:

…eption when asked to commit files unknown to git (Closes #1291)
ENH: provide 'files' parameter for commit
ENH: Don't log git commit call failures at ERROR level
ENH: enable ok_clean_git's untracked_files feature for any AnnexRepo
BF: AnnexRepo.untracked_files does what is available, if annex-status-bug is preventing regular call
BF: Workaround for AnnexRepo.add(git=True) in v6. Ugly but enables us to see behind test failures, caused by this one.
TST: Several v6 fixes in test_annexrepo.py
Additional notes and minor changes
BF: Again, don't use annex-proxy for committing in direct mode, since it fails, when a passed path is a submodule (in direct mode)
Additional notes and minor corrections
Disabling the lazy loading of Dataset.repo, since it conflicts with flyweight implementation.

Both need further discussion. See issue #1345
mih added 3 commits April 24, 2017 14:41
Necessary to support frequently employed calls without explicit kwarg
specification.
This is a rough approximation of the cases that I can see. I am sure
there are bits missing. However, the actual cases of install calls
that we already had in the tests and that always wanted this feature
seem to be happy with this draft.
mih added 12 commits April 26, 2017 14:52
…gh-1474)

If one exists. Previously it would use the local path inside the
superdataset that it would add to. This ensured that, once uninstalled,
the dataset could not be reobtained via a `get` call. Not optimal.
In line with gh-1476, the default renderer should be something that is
uniform across commands and not something that is tailored.
It may be a 'clone' command that performs the action, but a user is
actually performing and expecting an 'install'
Now reports on directories and files in git too. This is a step towards
only reporting results matching actual requests, and not end up with no
apparent result despite a specific request.
@yarikoptic (Member) commented May 3, 2017

Conflicts are because of me, I guess, so I will take a look now [done]

* origin/master: (22 commits)
  RF: unify log line for publishing data.. commented out initial sketch to properly handle decision to try to copy data
  ENH: Q&D tool to transfer urls from datalad to web remote
  BF: try to ignore .this-is-a-test-key files annex might leave behind ruining whereis command
  BF(minor): early and correct report on misspecified output kwarg for whereis
  BF(TEMP): publish - sort datasets in reverse order by path (Closes #1479)
  BF(TEMP): publish - sort datasets in reverse order by path (Closes #1479)
  BF/RF: remove wrapping with '' of func node str args
  BF(TST): skip test if no creds for s3://openneuro
  BF: add archive content -- should check for existing based on full path
  BF: rmtree -- also use unlink when trying to remove single file
  RF: make unlinking in add_url_to_file optional (unlink_existing)
  BF: publish data only if annex
  BF: publish data first where needed/possible (Closes #1480)
  BF: do not rely on diff but comparing commits if no paths restriction. Allow to --force publish
  BF: (re)add (remove/overwrite) file from within archive under annex, if already exists
  ENH: skip_if_url_is_not_available and skip test needing NITRC (Closes #1472)
  BF: we should verify versioned url, not original
  Attempt to get PY2 test going again (fixes gh-1471)
  BF: s3 versioned urls assert that the latest version before filtering out delete markers
  ENH: func nodes should get better description in logs
  ...
@mih (Member, Author) commented May 3, 2017

Merci!

Now the page is back, but not functioning properly -- requires auth etc.
@mih (Member, Author) commented May 4, 2017

FTR: #1476 is a candidate for merge in here.

I will not touch publish in this PR: version in master is substantially different.

@mih (Member, Author) commented May 5, 2017

FTR: Test failures are a timeout, and this (no idea):

======================================================================
FAIL: datalad.tests.test_tests_utils.test_get_resolved_values
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/virtualenv/python2.7.12/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/travis/build/datalad/datalad/datalad/tests/test_tests_utils.py", line 145, in test_get_resolved_values
    eq_(flavors, _get_resolved_flavors(flavors))
AssertionError: ['networkish', 'local'] != ['local']

* origin/master: (27 commits)
  BF(TST): fixed up test for recent changes to the hook
  RF: turn js minimization back on
  BF(workaround): in recursive ls json mode do not return full nested list of node, just generate their jsons
  BF+TST: tried to understand + fixup js tests... more to grasp
  ENH: just converted tabs to spaces and some minor reformatting in test.html
  BF: commented out ds.submodules use with a note
  RF: use href= instead of .assign since some folks recommended to reflect navigation in history
  ENH: we need to store either we caching for parent dir
  fighting the windmills
  ENH: cache persistently within session, not only type, but also metadata path, and ds url
  ENH: allow to start previously stopped container for webserver
  BF: Fixed up for trailing / and path to the installed subdatasets to get breadcrumbs coloring corrected
  RF: all -> all_, long_ -> long even at the level of API
  BF: update submodule upon publish/push within hook
  BF+ENH: pickup description for the datasets, enhance hook script
  BF: determining path to the web metadata file
  BF: Allow nodeJson to return complete metadata json
  RF: Remove scaffolding json param from main funcs
  ENH: Update crumb spans/colors based on their type
  BF: Return default type outside root ds boundary
  ...

 Conflicts:
	datalad/interface/ls.py -- maintained "master" way of getting relpaths, and then erecting a dataset
@yarikoptic yarikoptic merged commit 4b718ba into master May 6, 2017
@mih mentioned this pull request May 7, 2017