Skip to content
This repository has been archived by the owner on Jun 20, 2023. It is now read-only.

Databases not being updated #51, possibly #11 #63

Merged
merged 10 commits into from
Oct 20, 2019

Conversation

adminrobert
Copy link
Contributor

This implements logic to pull only the most up to date database for a type. Each database is an entity with possible different states (cvd, cld, cud). In my findings ".cud" is used purely for unsigned databases, and should not actually be used when referring to the main items used by freshclam (main, daily, bytecode). The main motivation for not treating each database as a different file in every combination (daily.cvd, daily.cld) is databases can get rather large and the run context of lambda is 512mb. During testing if the scanner had access to both a cvd and a cld it would just ignore the older file.

@CLAassistant
Copy link

CLAassistant commented Jan 29, 2019

CLA assistant check
All committers have signed the CLA.

@adminrobert adminrobert changed the title Databases not being update #51, possibly #11 Databases not being updated #51, possibly #11 Jan 29, 2019
@adminrobert
Copy link
Contributor Author

I did not realize I also needed to make sure the local database is compressed (cld). I pulled in the config option from #62

@adminrobert
Copy link
Contributor Author

@ocasta points out my changes require an update to the policy described in the README.md that I did not realize as I had policies in place that already allowed the necessary access.

adminrobert@8958b59#commitcomment-32121132

The solution should be to just implement this using the s3_client just like md5 does: head_object

@andybkay
Copy link

I have tested this, and it doesn't appear to fix #51, the cld file is uploaded, but not the cvd file.

@adminrobert
Copy link
Contributor Author

@andybkay The CLD and CVD databases are exclusive. The CVD is pretty much only used on a fresh download. After the initial download, diffs are downloaded going forward and then applied to the CLD/CVD to create an updated CLD only. The process originally already only uploads changed files. During my testing in my production environment, before I had code in place to not download all of the files every time I would get a complaint from clamav that the CLD was newer than the CVD and that the CVD was going to be ignored.

Can you check to see if the CLD is being updated? If the CLD is updated then the CVD most likely will not receive updates. The reason I had to put this fix in, in the first place is because freshclam was downloading diffs creating a CLD, then discarding all of that work because it never gets merged back into the CVD (by their design). Should freshclam deem a new CVD necessary it will download it, and the process will upload it to the s3 bucket because the md5sum would have changed.

This process is working correctly in my production environment and clamav does not complain that my databases are out of date. If you verify that your CLD is being updated but clam is still complaining about database out of date, we may need more information from your setup. Please also make sure you have all of the latest changes from my branch as I had a number of commits.

@Cecarlego
Copy link

I've implemented @adminrobert changes into my environment, the updates run well and the scan does not complain about the databases being out of date anymore.
Thank you!

@wilqq
Copy link

wilqq commented Oct 16, 2019

@jaygorrell @chrisgilmerproj I think this project is very useful, I appreciate your work, I see there is a lot of new changes in the repository. Unfortunately, I encounter the problem described in this PR. I think it's very important, without updating viruses database this solution doesn't provide reliable scanning.

Is there any chance this PR will be merged in the near future?

@adminrobert thank you for this PR

@chrisgilmerproj
Copy link
Collaborator

@wilqq @adminrobert - I just joined the project so I'm catching up here. But let me take a look at this PR and bring it in line with master and we'll see what we can do.

@chrisgilmerproj
Copy link
Collaborator

@adminrobert - I've updated this PR to what's on master. I'm going to write a few tests to make sure this does what we expect. Then I'll give it a thumbs up for merging.

@chrisgilmerproj
Copy link
Collaborator

I'm pretty happy with the tests around update_defs_from_s3 now. I'm still working on tests for upload_defs_to_s3. I think once we have test coverage for both of them we can merge with confidence. I apologize for the big code push to this branch but hopefully its helpful.

@chrisgilmerproj
Copy link
Collaborator

@jaygorrell @denniswebb - I think this is actually good to go. The changes to upload_defs_to_s3 doesn't really modify anything but the filename. I've got to test this in AWS but then I think it ought to be good to go.

The only issue I see is that the license/cla needs to be signed by @adminrobert or we won't be able to merge.

@adminrobert
Copy link
Contributor Author

@chrisgilmerproj thanks for getting this in line with the main line. I had since lost access to the entire setup with which I wrote this. I signed the license. Let me know if there is anything else you need from me.

@jaygorrell
Copy link
Contributor

This is great! Thanks guys.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants