New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MongoDB GridFS support #296

Merged
merged 7 commits into from Jul 17, 2018

Conversation

Projects
None yet
3 participants
@razims
Copy link
Contributor

razims commented May 23, 2018

No description provided.

@SebastianStehle
Copy link
Contributor

SebastianStehle left a comment

Thanks a lot, I have added some comments. My main question is: Why do you need a local folder?

}
}

public Task UploadAsync(string name, Stream stream, CancellationToken ct = default(CancellationToken))

This comment has been minimized.

@SebastianStehle

SebastianStehle May 23, 2018

Contributor

The code convention right now is not to use expression bodies. It is arguable, but I would like to keep it consistent.

This comment has been minimized.

@razims

razims May 23, 2018

Author Contributor

Your welcome Sebastian. Local folder is needed for two reasons:

  1. i have to provide the url to file (GenerateSourceUrl(string id, long version, string suffix)).
  2. performance

This comment has been minimized.

@SebastianStehle

SebastianStehle May 23, 2018

Contributor

Why? What are the cleanup rules? When do we delete them? What happens when you have multiple nodes? What is the advantage over simple folders?

GenerateSourceUrl is not that important. We can just return "UNSUPPORTED". The main reason is for debugging.

This comment has been minimized.

@SebastianStehle

SebastianStehle May 23, 2018

Contributor

If you want to keep a local copy I think it would make sense to make something like an "Aggregate" Asset store. Like a Multi Level Cache.

This comment has been minimized.

@razims

razims May 23, 2018

Author Contributor

expression bodies -> as you want Sir :)
GenerateSourceUrl -> haven't digged too much into logic, if you say so then fine

This comment has been minimized.

@SebastianStehle

SebastianStehle May 23, 2018

Contributor

I see. Can we make a ComposeAssetStore for that, who accepts a IEnumerable ... you could the wire it up with MemoryCacheAssetStore, FolderAssetStore, GridFSAssetStore? Right now the behavior is a little bit unexpected, at least it was for me.

This comment has been minimized.

@razims

razims May 23, 2018

Author Contributor

sounds good

This comment has been minimized.

@razims

razims May 23, 2018

Author Contributor

May be IFileProvider abstraction would be better than IEnumerable. What you think? This would allow to store nested way too as bonus, if required of course.

https://docs.microsoft.com/en-us/aspnet/core/fundamentals/file-providers?view=aspnetcore-2.1#file-provider-abstractions

This comment has been minimized.

@SebastianStehle

SebastianStehle May 23, 2018

Contributor

The IFileProvider is readonly, unfortunately. I mean something like this: https://github.com/aspnet/FileSystem/blob/dev/src/FS.Composite/CompositeFileProvider.cs

This comment has been minimized.

@razims

razims May 23, 2018

Author Contributor

CompositeFileProvider is just a set of IFileProviders. For reading fine

{
try
{
var filter =

This comment has been minimized.

@SebastianStehle

SebastianStehle May 23, 2018

Contributor

Is there no delete without quering first? You also ignore the version and suffix here.

This comment has been minimized.

@razims

razims May 23, 2018

Author Contributor

You are right, delete from local then from remote

This comment has been minimized.

@razims

razims May 23, 2018

Author Contributor

should i push a change here ?

This comment has been minimized.

@SebastianStehle

SebastianStehle May 23, 2018

Contributor

Would be great.

public async Task DownloadAsync(string id, long version, string suffix, Stream stream,
CancellationToken ct = default(CancellationToken))
{
var file = GetFile(id, version, suffix);

This comment has been minimized.

@SebastianStehle

SebastianStehle May 23, 2018

Contributor

Why do you need a local file?

This comment has been minimized.

@razims

razims May 23, 2018

Author Contributor

performance

This comment has been minimized.

@SebastianStehle

SebastianStehle May 23, 2018

Contributor

As I said: if you want to improve performance it would make more sense to implement something like a CachingAssetStore that has an inner IAssetStore. Because you can use the caching for all other storages as well then. And if you want to cache it you have also to clean it up. Otherwise you could have a lot of files and you have no idea where all your disk space is gone.

This comment has been minimized.

@razims

razims May 23, 2018

Author Contributor

CachingAssetStore -> agree

This comment has been minimized.

@razims

razims May 23, 2018

Author Contributor

where you want to cache the files? Memory?? You'll cache them on disk anyway, locally

This comment has been minimized.

@SebastianStehle

SebastianStehle May 23, 2018

Contributor

Depends how much cache levels you want.

}
}

public string GenerateSourceUrl(string id, long version, string suffix)

This comment has been minimized.

@SebastianStehle

SebastianStehle May 23, 2018

Contributor

This method must only be implemented if there is a URL where or path where I can actually download the asset from in a permanent way. In this case it is not possible, but it is fine.

This comment has been minimized.

@SebastianStehle

SebastianStehle May 23, 2018

Contributor

Just return NotSupportedException or so.

This comment has been minimized.

@razims

razims May 23, 2018

Author Contributor

agree, but i would keep files locally anyway

This comment has been minimized.

@razims

razims May 23, 2018

Author Contributor

We can have a flag, keepLocal=true/false

This comment has been minimized.

@SebastianStehle

SebastianStehle May 23, 2018

Contributor

Thats fine for me too.

This comment has been minimized.

@razims

razims May 23, 2018

Author Contributor

what about the case when we have set keepLocal=false earlier and then call GenerateSourceUrl to get "static" url? Create locally, then serve?


using (var stream = new MemoryStream())
{
await bucket.DownloadToStreamByNameAsync(file.Name, stream, cancellationToken: ct);

This comment has been minimized.

@SebastianStehle

SebastianStehle May 23, 2018

Contributor

Can we not just rename the asset?

This comment has been minimized.

@razims

razims May 23, 2018

Author Contributor

no support from IGridFSBucket, unfortunately :(

This comment has been minimized.

@SebastianStehle

SebastianStehle May 23, 2018

Contributor

Or at least a version where you do not keep everything in memory.

This comment has been minimized.

@razims

razims May 23, 2018

Author Contributor

Rename != Copy

This comment has been minimized.

@razims

razims May 23, 2018

Author Contributor

no :) , copy means you have two copies at the end

This comment has been minimized.

@razims

razims May 23, 2018

Author Contributor

it would make sense to use Rename if you'll rename IAssetStore.CopyAsync to IAssetStore.RenameAsync. Or just add new method IAssetStore.RenameAsync

This comment has been minimized.

@SebastianStehle

This comment has been minimized.

@razims

razims May 23, 2018

Author Contributor

umbenennen auf Deutsch ist nicht unbedingt gleich wie kopieren, oder? :)

This comment has been minimized.

@SebastianStehle

SebastianStehle May 23, 2018

Contributor

Yes, as I said you are right. I was thinking too much about the use case here, where it does not really matter if it is copy or rename because the source will be deleted anyway ;)

@razims

This comment has been minimized.

Copy link
Contributor Author

razims commented May 23, 2018

One more comment. Since you use Autofac, i see no reason to use Guard.NotNull for injected objects. Autofac has built in check for this. It's better to use JetBrains.Annotation like [NotNull] and if you use Resharper you'll get notified during coding.

@SebastianStehle

This comment has been minimized.

Copy link
Contributor

SebastianStehle commented May 23, 2018

I do not use Autofac. I like to keep the a class independent of the infrastructure, so that I get proper exceptions. I do not use Resharper anymore because it was too annoying to keep the code styles and warnings consistent across the environments. Much better with Roslyn now :)

fix delete order
- local file should be removed first
using (var cursor = await bucket.FindAsync(filter))
file.Delete();
using (var cursor = await bucket.FindAsync(Builders<GridFSFileInfo>.Filter.And(
Builders<GridFSFileInfo>.Filter.Eq(x => x.Filename, file.Name)

This comment has been minimized.

@SebastianStehle

SebastianStehle May 23, 2018

Contributor

I still do not see why you have to loop. Can you not use use: bucket.DeleteAsync(file.Name)?

This comment has been minimized.

@razims

razims May 23, 2018

Author Contributor

Unfortunately no support from IGridFSBucket to delete fine by name currently.

This comment has been minimized.

@SebastianStehle

SebastianStehle May 23, 2018

Contributor

Is the name not the same like Id?

This comment has been minimized.

@razims

razims May 23, 2018

Author Contributor

no, as far as I understood it is ObjectId

This comment has been minimized.

This comment has been minimized.

@razims

razims May 23, 2018

Author Contributor

then Id will be equal Filename

This comment has been minimized.

@SebastianStehle

This comment has been minimized.

@razims

razims May 23, 2018

Author Contributor

it will also overcomplicate bucket.UploadFromStreamAsync(file.Name, stream, cancellationToken: ct);

This comment has been minimized.

@SebastianStehle

SebastianStehle May 23, 2018

Contributor

Perhaps I have another version, but for me it looks like

bucket.UploadFromStreamAsync(file.Name, file.Name, stream, ...);

https://mongodb.github.io/mongo-csharp-driver/2.4/apidocs/html/M_MongoDB_Driver_GridFS_GridFSBucket_1_UploadFromStreamAsync.htm

This comment has been minimized.

@razims

razims May 23, 2018

Author Contributor

apparently i missed it, fixed, pushed

@razims

This comment has been minimized.

Copy link
Contributor Author

razims commented May 23, 2018

failed - no space left on device :(

@SebastianStehle

This comment has been minimized.

Copy link
Contributor

SebastianStehle commented May 23, 2018

Yes, no problem. The next commit will work again ;)

using (var stream = new MemoryStream())
{
await bucket.DownloadToStreamByNameAsync(file.Name, stream, cancellationToken: ct);
await bucket.UploadFromStreamAsync(file.Name, stream, cancellationToken: ct);
await bucket.DownloadToStreamAsync(file.Name, stream, cancellationToken: ct);

This comment has been minimized.

@SebastianStehle

SebastianStehle May 23, 2018

Contributor

Can be improved with

using (var readStream = await bucket.GetDownloadStreamAsync(file))
{
   using (var writeStream = await bucket.GetUploadStreamAsync(file.Name))
   {	
   	var buffer = new byte[4096]; // Should be equivalent to options.BlockSize
   	var bytesRead = 0;
   	
   	while ((bytesRead = await readStream.ReadAsync(buffer, 0, buffer.Length)) 
   	{
   		await writeStream.WriteAsync(buffer, 0, bytesRead);
   	}
   }
}

This comment has been minimized.

@razims

razims May 23, 2018

Author Contributor

yes, right

This comment has been minimized.

@razims

razims May 23, 2018

Author Contributor

but where you found GetDownloadStreamAsync?

OpenDownloadStreamAsync maybe ?

This comment has been minimized.

@SebastianStehle

SebastianStehle May 23, 2018

Contributor

sry, OpenDownloadStreamAsync

This comment has been minimized.

@razims

razims May 23, 2018

Author Contributor

where you took options.BlockSize from ?

This comment has been minimized.

@SebastianStehle

SebastianStehle May 23, 2018

Contributor

I mean ChunkSizeBytes

razims added some commits May 23, 2018

fix
@Jaben

This comment has been minimized.

Copy link

Jaben commented Jul 11, 2018

This isn't going to be merged into the project @SebastianStehle ?

@SebastianStehle

This comment has been minimized.

Copy link
Contributor

SebastianStehle commented Jul 12, 2018

I thought the development has kind of stopped because I had the impression that some of the discussions were still open / not resolved.

@razims

This comment has been minimized.

Copy link
Contributor Author

razims commented Jul 13, 2018

actually not, we're already using it internally works great. I thought you were not interested 😀 to merge it. Anyway please clarify, what is required and what you don't like. I'll fix/push in next days.

@SebastianStehle

This comment has been minimized.

Copy link
Contributor

SebastianStehle commented Jul 13, 2018

I just had a look to the code again and I am not sure If I remember everything about the local file handling. In a multi-node environment the files do not always exist, but the error handling does not reflect it.

@razims

This comment has been minimized.

Copy link
Contributor Author

razims commented Jul 13, 2018

@SebastianStehle if you mean storing the copies of images locally, initially i thought it may increase the throughput and response time. But then i discovered that it adds additional "cache" level and increases complexity. It should generally be the task of consumer, whether he/she wants to cache the result or not. I'll remove the local file handling.

@SebastianStehle

This comment has been minimized.

Copy link
Contributor

SebastianStehle commented Jul 13, 2018

That would be great. If requested I would implement a cache for ALL storage providers.

Furthermore:

  1. Please move the class to the Infrastructure.MongoDb project
  2. Please to not swallow exceptions in Initialize()
@razims

This comment has been minimized.

Copy link
Contributor Author

razims commented Jul 13, 2018

@SebastianStehle

This comment has been minimized.

Copy link
Contributor

SebastianStehle commented Jul 13, 2018

Btw: I hope you know the new forum: https://support.squidex.io/

@razims

This comment has been minimized.

Copy link
Contributor Author

razims commented Jul 13, 2018

@SebastianStehle yep, you invited once

@Jaben

This comment has been minimized.

Copy link

Jaben commented Jul 13, 2018

I'm using it and quite happy with it so far... nice work @razims . One thing I noticed doesn't work is the backup -- assets are all empty.

@SebastianStehle

This comment has been minimized.

Copy link
Contributor

SebastianStehle commented Jul 15, 2018

Could also be the backup system. I will test it when I merge it.

@SebastianStehle SebastianStehle merged commit 8913e0c into Squidex:master Jul 17, 2018

1 check passed

continuous-integration/drone/pr the build was successful
Details
@razims

This comment has been minimized.

Copy link
Contributor Author

razims commented Jul 17, 2018

you didn't wait for me 😀

@SebastianStehle

This comment has been minimized.

Copy link
Contributor

SebastianStehle commented Jul 17, 2018

I misunderstood you and thought it is already done. When I merged it locally I realized that I can just finish within half an hour and just completed it.

@razims

This comment has been minimized.

Copy link
Contributor Author

razims commented Jul 17, 2018

okay, fine 👍🏻

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment