Skip to content

The Msgs Sample Application (Spider)

Randy Guck edited this page Mar 12, 2015 · 7 revisions

[Table of Contents](https://github.com/dell-oss/Doradus/wiki/Spider Databases: Table-of-Contents) | [Previous](https://github.com/dell-oss/Doradus/wiki/When Spider Works Best) | [Next](https://github.com/dell-oss/Doradus/wiki/Spider Data Model)
Spider Database Overview: The Msgs Sample Application


To illustrate features, a sample application called `Msgs` is used. The `Msgs` application schema is depicted below:

The tables in the Msgs application are used as follows:

  • Message: Holds one object per sent email. Each object stores values such as Body, Subject, Size, and SendDate and links to Participant objects in three different roles: sender, internal recipient, and external recipient.

  • Participant: Represents a sender or receiver of the linked Message. Holds the ReceiptDate timestamp for that participant and links to identifying Person and Address objects.

  • Address: Stores each participant’s email address, a redundant link to Person, and a link to a Domain object.

  • Person: Stores person Directory Server properties such as a Name, Department, and Office.

  • Domain: Stores unique domain names such as "yahoo.com".

In the diagram, relationships are represented by their link name pairs with arrows pointing to each link’s extent. For example, ↓Sender is a link owned by Message, pointing to Participant, and MessageAsSender↑ is the inverse link of the same relationship. The Manager and DirectReports links form a reflexive relationship within Person, representing an org chart.

The schema for the Msgs application is shown below in JSON:

{"Msgs": {
    "key": "MsgsKey",
    "options": {
        "StorageService": "SpiderService",
        "AutoTables": "false"
    },
    "tables": {
        "Message": {
            "options": {
                "sharding-field": "SendDate",
                "sharding-granularity": "DAY",
                "sharding-start": "2010-07-17",
                "aging-field": "SendDate",
                "retention-age": "5 YEARS",
                "aging-check-frequency": "3 DAYS"
            },
            "fields": {
                "Body": {"type": "TEXT"},
                "Participants": {
                    "fields": {
                        "Sender": {"type": "LINK", "table": "Participant", "inverse": "MessageAsSender"},
                        "Recipients": {
                            "fields": {
                                "ExternalRecipients": {"type": "LINK", "table": "Participant", "inverse": "MessageAsExternalRecipient"},
                                    "InternalRecipients": {"type": "LINK", "table": "Participant", "inverse": "MessageAsInternalRecipient"}
                            }
                        }
                    }
                },
                "SendDate": {"type": "TIMESTAMP"},
                "Size": {"type": "INTEGER"},
                "Subject": {"type": "TEXT"},
                "Tags": {"collection": "true", "type": "TEXT"},
                "ThreadID": {"type": "TEXT", "analyzer": "OpaqueTextAnalyzer"}
            },
            "aliases": {
                "$SalesEmails": {"expression": "Sender.Person.WHERE(Department:Sales)"}
            }
        },
        "Participant": {
            "options": {
                "sharding-field": "ReceiptDate",
                "sharding-granularity": "DAY",
                "sharding-start": "2012-07-17"
            },
            "fields": {
                "MessageAddress": {"type": "LINK", "table": "Address", "inverse": "Messages"},
                "MessageAsExternalRecipient": {"type": "LINK", "table": "Message", "inverse": "ExternalRecipients"},
                "MessageAsInternalRecipient": {"type": "LINK", "table": "Message", "inverse": "InternalRecipients"},
                "MessageAsSender": {"type": "LINK", "table": "Message", "inverse": "Sender"},
                "Person": {"type": "LINK", "table": "Person", "inverse": "Messages"},
                "ReceiptDate": {"type": "TIMESTAMP"}
            }
        },
        "Address": {
            "fields": {
                "Domain": {"type": "LINK", "table": "Domain", "inverse": "Addresses"},
                "Messages": {"type": "LINK", "table": "Participant", "inverse": "MessageAddress", "sharded": "true"},
                "Name": {"type": "TEXT"},
                "Person": {"type": "LINK", "table": "Person", "inverse": "MessageAddresses"}
            }
        },
        "Person": {
            "fields": {
                "DirectReports": {"type": "LINK", "table": "Person", "inverse": "Manager"},
                "FirstName": {"type": "TEXT"},
                "LastName": {"type": "TEXT"},
                "Location": {
                    "fields": {
                        "Department": {"type": "TEXT"},
                        "Office": {"type": "TEXT"}
                    }
                },
                "Manager": {"type": "LINK", "table": "Person", "inverse": "DirectReports"},
                "MessageAddresses": {"type": "LINK", "table": "Address", "inverse": "Person"},
                "Messages": {"type": "LINK", "table": "Participant", "inverse": "Person"},
                "Name": {"type": "TEXT"}
            }
        },
        "Domain": {
            "fields": {
                "Addresses": {"type": "LINK", "table": "Address", "inverse": "Domain"},
                "InternalID": {"type": "text", "analyzer": "NullAnalyzer"},
                "IsInternal": {"type": "BOOLEAN"},
                "Name": {"type": "TEXT"}
            }
        }
    }
}}

Some highlights of this schema (details described later):

  • The application-level option StorageService explicitly assigns the application to the SpiderService. The AutoTables option is disabled.

  • The Message table uses these unique features:

  • Table-level sharding is enabled via the sharding-field, sharding-granularity, and sharding-start options, and automatic data aging is enabled via the aging-field and retention-age options. The data aging check frequency is set to 3 days.

  • A group field called Participants contains the link field Sender and a second-level group called Recipients, which contains the links InternalRecipients and ExternalRecipients. All three links point to the Participant table, allowing the Participants and Recipients group fields to be used in DQL quantifiers.

  • The ThreadID field is assigned the non-default OpaqueTextAnalyzer, which means it can only be searched for whole values (not terms).

  • An alias called $SalesEmails is assigned the expression Sender.Person.WHERE(Department:Sales). $SalesEmails is dynamically expanded when used in DQL queries.

  • The Participant table uses table-level sharding similarly as the Message table.

  • The Address table declares the Messages link as sharded to leverage its target table’s sharding.

  • The Domain table declares the InternalID field with the non-default NullAnalyzer. This means the InternalID can be returned in queries but is not indexed and cannot be searched.

Clone this wiki locally