Removing createMetaDB and adding docs to file indexing function #287

Merged
ethack merged 1 commit into master from 286-fix-metadb-init on Dec 6, 2018

ethack commented Dec 6, 2018

Description

This change removes the createMetaDB function and related code. The rationale is discussed in the issue this closes.

Additionally, comments are added to a function whose name is easily confused with unrelated Mongo terminology.

Closes #286

Manual Testing

First, I started with a clean database by dropping MetaDatabase.

# in mongo shell
> db.dropDatabase()
{ "dropped" : "MetaDatabase", "ok" : 1 }

Next, I imported a sample dataset and verified that it was populated and MetaDatabase was created.

$ ./rita import sample-data/bro/dnscat dnscat
[+] Importing sample-data/bro/dnscat
	[-] Finding files to parse
...
	[-] Indexing log entries. This may take a while.
$ rita show-databases
dnscat
# in mongo shell
> show dbs
MetaDatabase  0.000GB
admin         0.000GB
config        0.000GB
dnscat        0.122GB
local         0.000GB
rita-bl       0.009GB
> use MetaDatabase
switched to db MetaDatabase
> show collections
databases
files
logs
> db.databases.find()
{ "_id" : ObjectId("5c086bc9ca8a551b0f610044"), "name" : "dnscat", "import_finished" : true, "analyzed" : false, "import_version" : "v1.1.1", "analyze_version" : "" }
> db.files.find()
{ "_id" : ObjectId("5c086bcfca8a551b0f6b8f8e"), "filepath" : "sample-data/bro/dnscat/conn.00:00:00-01:00:00.log.gz", "length" : NumberLong(667023), "modified" : ISODate("2018-11-30T16:50:12.216Z"), "hash" : "907d5fd5921b0f4f565e38c45befdb7f", "collection" : "conn", "database" : "dnscat", "time_complete" : ISODate("2018-12-06T00:22:34.801Z") }
...

There were no error messages, only info messages and the usual warnings about files the parser did not recognize.

Next, I tried re-importing the same files into the same database.

$ ./rita import sample-data/bro/dnscat dnscat
[+] Importing sample-data/bro/dnscat
	[-] Finding files to parse
	[-] Indexing log entries. This may take a while.

As expected, this did not find any new files to import. Additionally, a warning like the following was logged for each file:

time="2018-12-05T18:28:36-06:00" level=warning msg="Refusing to import file into the same database twice" path="sample-data/bro/dnscat/conn.00:00:00-01:00:00.log.gz" target_database=dnscat

Deleting the database also functioned as expected.

$ ./rita delete-database dnscat
Are you sure you want to delete database dnscat [y/N] y
Deleting database: dnscat
# in mongo shell
> use MetaDatabase
switched to db MetaDatabase
> db.files.find()
> db.databases.find()

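The comparison below shows the MetaDatabase indexes created by a prior version of RITA (v1.1.1) versus those created by this pull request. As expected, the unique nameindex and hashindex indexes are no longer created, leaving only the default _id_ indexes.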

v1.1.1

> use MetaDatabase
switched to db MetaDatabase
> db.databases.getIndexes()
[
	{
		"v" : 2,
		"key" : {
			"_id" : 1
		},
		"name" : "_id_",
		"ns" : "MetaDatabase.databases"
	},
	{
		"v" : 2,
		"unique" : true,
		"key" : {
			"name" : 1
		},
		"name" : "nameindex",
		"ns" : "MetaDatabase.databases",
		"background" : true
	}
]
> db.files.getIndexes()
[
	{
		"v" : 2,
		"key" : {
			"_id" : 1
		},
		"name" : "_id_",
		"ns" : "MetaDatabase.files"
	},
	{
		"v" : 2,
		"unique" : true,
		"key" : {
			"hash" : 1,
			"database" : 1
		},
		"name" : "hashindex",
		"ns" : "MetaDatabase.files",
		"background" : true
	}
]

This pull request

> use MetaDatabase
switched to db MetaDatabase
> db.databases.getIndexes()
[
	{
		"v" : 2,
		"key" : {
			"_id" : 1
		},
		"name" : "_id_",
		"ns" : "MetaDatabase.databases"
	}
]
> db.files.getIndexes()
[
	{
		"v" : 2,
		"key" : {
			"_id" : 1
		},
		"name" : "_id_",
		"ns" : "MetaDatabase.files"
	}
]
@@ -134,11 +130,6 @@ func (m *MetaDB) DeleteDB(name string) error {
		return err
	}

	_, err = ssn.DB(m.config.S.Bro.MetaDB).C("files").RemoveAll(bson.M{"database": name})

ethack Dec 6, 2018

Collaborator

This chunk was doing the exact same thing as the few lines above it, with the difference that "files" is hard-coded here instead of using the value from the config as above.

//removeOldFilesFromIndex checks all indexedFiles passed in to ensure
//that they have not previously been imported into the same database.
//The files are compared based on their hashes (md5 of first 15000 bytes)
//and the database they are slated to be imported into.
func removeOldFilesFromIndex(indexedFiles []*fpt.IndexedFile,

ethack Dec 6, 2018

Collaborator

The name of this function is a little ambiguous: it's not clear what makes files "Old", and "Index" is also a Mongo term, which is not what is meant here. The comment hopefully helps clear this up.

lisaSW approved these changes Dec 6, 2018

lisaSW left a comment

Looks great!

ethack merged commit 53d435f into master Dec 6, 2018

2 checks passed

continuous-integration/travis-ci/pr: The Travis CI build passed
continuous-integration/travis-ci/push: The Travis CI build passed

ethack deleted the 286-fix-metadb-init branch Dec 10, 2018
