New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect media path formatting #245

Closed
mingard opened this Issue Apr 24, 2017 · 14 comments

Comments

Projects
None yet
3 participants
@mingard
Copy link
Member

mingard commented Apr 24, 2017

See:

return `${config.get('server.protocol')}://${config.get('server.host')}:${config.get('server.port')}${folderPath}/${fileName}`

Path is using server.host. It should not be using a local value and should instead use a new variable held in media block.

@eduardoboucas

This comment has been minimized.

Copy link
Member

eduardoboucas commented Apr 24, 2017

Why shouldn't it use server.host?

@mingard

This comment has been minimized.

Copy link
Member Author

mingard commented Apr 24, 2017

Because server.host can be either local (127.0.0.1, 0.0.0.0 etc) or a hostname. It isn't the hostname of the api instance

@eduardoboucas

This comment has been minimized.

Copy link
Member

eduardoboucas commented Apr 24, 2017

Right. Should we create a more global configuration parameter (like server.baseUrl), as this isn't something specific to media and could be useful elsewhere? We've been using something similar in Web on client projects, although there isn't an official configuration parameter for it.

@mingard

This comment has been minimized.

Copy link
Member Author

mingard commented Apr 24, 2017

@eduardoboucas I think it should be removed entirely. If you're making a request to {protocol}://{host}:{port}/{version}/{database}/images/ then what's the point of passing back a full path?

Is there a reason we're not storing in S3? That must be a requirement for CDN to work no?

It would be useful to pass back a storage method. so that Publish can decide whether to try and load directly from API or from config.cdn in Publish.

@eduardoboucas

This comment has been minimized.

Copy link
Member

eduardoboucas commented Apr 24, 2017

I think it should be removed entirely. If you're making a request to {protocol}://{host}:{port}/{version}/{database}/images/ then what's the point of passing back a full path?

Not sure I get what you mean. When you upload an image, you'll need to know which URL you'll be able to access it on. This will vary depending on the storage method selected, which happens at API level.

Is there a reason we're not storing in S3? That must be a requirement for CDN to work no?

No, CDN doesn't require S3. It was also a requirement that API didn't depend on S3 (see comments here), and that Publish doesn't require CDN. Since version 1.16.0, API is capable of handling the upload and delivery of media assets by itself.

It would be useful to pass back a storage method. so that Publish can decide whether to try and load directly from API or from config.cdn in Publish.

I don't see why we'd need this. If API gives you the full URL of where the image is, you can then decide to load it directly or to put CDN in front of it — i.e. if it gives you http://foo.com/bar.jpg, you can use that directly or use http://yourcdn.com/http://foo.com/bar.jpg?width=600&height=300.

@mingard

This comment has been minimized.

Copy link
Member Author

mingard commented Apr 24, 2017

@eduardoboucas

No, CDN doesn't require S3. It was also a requirement that API didn't depend on S3

So what if we want to use S3, as we probably will?

Not sure I get what you mean. When you upload an image, you'll need to know which URL you'll be able to access it on.

In that case we should store the hostname for media in the media block. That way if the storage method is S3 instead of local, we can return the specified hostname. e.g.

"media": {
  "enabled": true,
  "storage": "disk",
  "basePath": "workspace/media",
  "pathFormat": "date",
  "host": "http://api-domain.com" 
}
"media": {
  "enabled": true,
  "storage": "s3",
  "basePath": "s3/path/to/media",
  "pathFormat": "date",
  "host": "https://s3-hostname.com" ,
  "s3": {
      "accessKey": "xxxx",
      "secretKey": "xxxx",
      "bucketName": "publish-data",
      "region": "eu-west-1"
    }
}

you can use that directly or use http://yourcdn.com/http://foo.com/bar.jpg?width=600&height=300

Not that efficient. Rather than CDN loading from S3 and manipulating, we'd be loading across the internet.

@eduardoboucas

This comment has been minimized.

Copy link
Member

eduardoboucas commented Apr 24, 2017

So what if we want to use S3, as we probably will?

That's fine, S3 is one of the storage methods supported by API, my point was that it isn't a requirement. Whether or not the S3 adapter is up-to-date with the latest changes to how API handles media is something we'll need to ask @jimlambie, as I haven't worked on that.

As for the rest of your comment re: adding base URL to the media block, sounds reasonable. Again, I'm not too familiar with the S3 implementation, so @jimlambie will be better positioned to comment on that.

@mingard

This comment has been minimized.

Copy link
Member Author

mingard commented Apr 24, 2017

Okay. @jimlambie over to you!

@jimlambie

This comment has been minimized.

Copy link
Member

jimlambie commented Apr 25, 2017

Right... ok, so S3 is returning an incorrect path. It's returning the basePath + the folders + filename.

We can change it to return the URL provided after the AWS putObject operation. @eduardoboucas we talked about dropping awsUrl, and replacing with path as for the file storage.

So:

{
  "results": [
    {
      "fileName": "Slack_Mark_Web.png",
      "mimetype": "image/png",
      "width": 275,
      "height": 275,
      "contentLength": 3510,
      "path": "https://lambie.s3.amazonaws.com/workspace/media/vjoin/testdb/images/2017/04/25/Slack_Mark_Web.png",
      "apiVersion": "vjoin",
      "createdAt": 1493108080933,
      "createdBy": "testClient",
      "history": [],
      "v": 1,
      "_id": "58ff057271f2aede4f3e1186"
    }
  ]
}
@jimlambie

This comment has been minimized.

Copy link
Member

jimlambie commented Apr 25, 2017

@eduardoboucas @mingard I think I'll add a baseUrl property to server in config. Seems like it might get some use somewhere else.

This is what the disk storage module would look like when preparing an asset's full path:

  if (serverConfig.baseUrl) {
    return `${serverConfig.protocol}://${serverConfig.baseUrl}${folderPath}/${fileName}`
  } else {
    return `${serverConfig.protocol}://${serverConfig.host}:${serverConfig.port}${folderPath}/${fileName}`
  }
@mingard

This comment has been minimized.

Copy link
Member Author

mingard commented Apr 25, 2017

@jimlambie first condition looks good, but the else clause is a default that is likely to cause issues. Unless API is hosted on the same instance as Publish, it'll never work

@mingard

This comment has been minimized.

Copy link
Member Author

mingard commented Apr 25, 2017

Also, using the config protocol value means you'd need the API instance to be running on the same protocol as your image repository.

@eduardoboucas

This comment has been minimized.

Copy link
Member

eduardoboucas commented Apr 28, 2017

Should we take baseUrl out of server block and make that a block of its own? Following the convention used in server, we could have:

"baseUrl": {
  "protocol": "http",
  "port": 80,
  "host": "mydomain.com"
}
@eduardoboucas

This comment has been minimized.

Copy link
Member

eduardoboucas commented Apr 28, 2017

Also, should the endpoints be /media (what is currently implemented) or /api/media (what was on the original spec created by @jimlambie). We seem to use the /api namespace for retrieving and setting metadata (config, collections), whilst media is closer to an actual collection — it just doesn't have version and database, so if we take those out /media seems to make sense.

In any case, it's an easy change to make so happy to go either way.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment