
exam notes

1 parent ff35a7c commit 9b5bc557d4f26e3e1f7ccdaece756a33e854c149 George Erickson committed May 22, 2012
Showing with 128 additions and 0 deletions.
  1. +41 −0 6.033/papers/object_store.md
  2. 0 6.033/papers/porcupine.md
  3. +87 −0 6.033/papers/unison.md
@@ -0,0 +1,41 @@
+# ObjectStore
+
+
+## Goals
+- make fetching persistent objects as fast as fetching transient ones
+
+## Collections
+- Create a collection attribute of an object by declaring a field of type *os_Set*
+- A policy can be associated with a collection, and ObjectStore transparently selects the most efficient data structure for the application
+
+## Relationships
+- declare using *inverse_member*
+- supports 1-1, 1-many and many-many
+- These references are automatically updated to always be in sync
+
+## Queries
+- nesting supported
+- doesn't support full joins, but does support semi-joins, i.e. the result of a query is a subset of the collection being queried
+
+## Versioning
+- a user can check out a version of an object and check it back in at a later time
+- no concurrency issues because a user can simply checkout an alternate version
+- merging of versions is left up to the application
+
+## Implementation
+- Tag table - keeps track of type and location of every object
+- applications can improve performance by clustering objects that are used together
+ - the db is split into chunks called segments
+ - the app can choose which segment an object resides in
+
+### Collections
+- os_collection - automatically chooses the proper subclass
+ - os_set
+ - os_bag
+ - os_list
+
+### Queries
+- join optimization easy
+- index maintenance hard
+ - requires declaration of fields that may be indexable
+ - if one of these fields changes, the index is updated
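
The index maintenance bullets above can be illustrated with a small Python sketch. It is not ObjectStore's API: `IndexedField`, `IndexedSet`, and `Part` are hypothetical names, and the only idea taken from the notes is that a field declared indexable triggers an index update whenever it changes.

```python
from collections import defaultdict


class IndexedField:
    """Descriptor marking a field as indexable (hypothetical name)."""

    def __set_name__(self, owner, name):
        self.name = name
        self.slot = "_" + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self.slot, None)

    def __set__(self, obj, value):
        old = getattr(obj, self.slot, None)
        setattr(obj, self.slot, value)
        # Notify every collection holding this object that the key changed.
        for coll in obj._collections:
            coll._reindex(obj, self.name, old, value)


class Part:
    weight = IndexedField()  # declared up front as a field that may be indexed

    def __init__(self, name, weight):
        self._collections = []
        self.name = name
        self.weight = weight


class IndexedSet:
    """A set of objects with one index on a declared field (assumption)."""

    def __init__(self, field):
        self.field = field
        self.members = set()
        self.index = defaultdict(set)

    def add(self, obj):
        self.members.add(obj)
        obj._collections.append(self)
        self.index[getattr(obj, self.field)].add(obj)

    def _reindex(self, obj, field, old, new):
        if field != self.field or obj not in self.members:
            return
        self.index[old].discard(obj)
        if not self.index[old]:
            del self.index[old]
        self.index[new].add(obj)

    def query(self, value):
        # The result is always a subset of the collection being queried.
        return set(self.index.get(value, ()))
```

Assigning to `part.weight` after the part has been added to an `IndexedSet` moves it between index buckets, so `query` never returns a stale match.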
No changes.
@@ -0,0 +1,87 @@
+# Unison
+## Design
+- users' changes are sacred
+- user level program, syncs using the current state of the system
+ - not trace-based sync
+- portable across a large number of OSes
+- high performance when syncing large (1GB) replicas
+- users should be able to predict its behavior in all situations
+- Only syncs whole files; it is a conflict if the same file has been edited in both replicas
+- Only supports pairwise sync
+
+## Architecture
+- client process started with 2 roots for sync
+- *update detection* performed separately by the client and server
+ - read an archive file created at the end of the last sync
+ - compare this with the current state to figure out a list of paths that have changed
+ - a file is changed if its contents are different
+- *reconciliation* - the two lists are merged to create a *task list*
+ - deletion is considered as a special kind of content, so it is treated the same as created and changed files
+ - conflicts - a path is in conflict if
+ 1. it has been updated in one replica
+ 2. it or any of its descendants have been updated in the other replica
+ 3. its contents in the two replicas are not identical
+ - this means that deleting a directory in one replica and changing a descendant of that directory in the other is a conflict
+- *confirmation* - task list displayed to user
+- *propagation* - propagate the changes between replicas
+
+## Robustness
+### Crash resistance
+- At every moment during a run of unison, every file has either its original contents or its correct final contents
+- atomic file replacement
+ - the file to be replaced is renamed to a temp file
+ - the replacement file is renamed to the target file
+- this atomic file replacement is required in two places
+ 1. updating a user file
+ - transfers the new version over the network to make a temp file (not atomic)
+ 2. updating its archive
+ - could be interrupted after it has updated the user file but before the archive is updated
+ - this is ok: on the next run the file's contents are identical in both replicas, so no conflict is reported
+
+### Concurrent file system updates
+1. user may modify a file after unison decides it should be replaced
+ - unison checks for this by re-fingerprinting the file immediately before it writes over it
+2. user modifies a file while unison is in the middle of transferring it
+
+### Update Detection
+- if a file is changed and unison does not detect this, data may be lost
+- modtimes
+ - not sufficient because they don't change when a file is renamed
+ - if A and B have the same modtime and A is deleted and B is renamed to A, modtimes would show no change to A
+ - on Unix this can be detected with inodes, but not with 100% certainty because modtimes can be set back
+- only sure way is to compare the fingerprint of every file
+ - takes a long time
+ - a less-safe method is included for impatient users
+
+### Transfer Errors
+- checksums
+
+### Archive loss
+- user deletes archive file
+- behaves as though both replicas had been completely empty at the last sync
+
+### Error handling
+- Safe for the program to crash at any time
+
+### Cross-platform sync
+1. Case sensitivity of file names
+ - treat both file systems as case insensitive
+ - if it finds two files whose names differ only in capitalization, it displays a warning that they can't be synced to Windows
+2. File permissions
+ - UNIX
+ - file has an owner and a group
+ - can specify read/write/execute permissions separately for owner, group, and all others
+ - Windows
+ - no owners or group
+ - only a read-only flag (a file is either read-only or writable)
+
+
+3. Symbolic Links
+4. Illegal file names
+5. modtime granularity
+6. Reconciling line endings
+7. mounted disks
+
+## Performance
+1. Threads
+2. Rsync
