adapter: DataImports API and refactoring #322

Merged
merged 5 commits into from
Feb 25, 2021

Contributor

@lunedis lunedis commented Feb 15, 2021

closes #135

I have a very bad habit of not committing while I work, so sorry in advance for this mess of a PR consisting of 1 huge commit.

This PR contains the following:

  • Introduction of the concept of a "Data Import", which is the result of calling /trigger on a datasource. DataImports are saved in the database and can be accessed using new API endpoints (see README and related issue).
  • Big exception-related refactoring: The internal classes (Adapter, DatasourceManager, ...) now throw only application-specific exceptions. Those exceptions are translated into ResponseStatusExceptions with a suitable status code in the endpoint classes. I got a bit carried away, so this is done using a Map of exceptions to status codes and a fancy method that takes a custom lambda. Check the code review for a caveat.
  • Some cleanup of imports related to using lombok for e.g. Equals and Hashcode methods.
  • Updated readme, unit tests and integration tests for all of the above.

@lunedis lunedis self-assigned this Feb 15, 2021
@CLAassistant

CLAassistant commented Feb 15, 2021

CLA assistant check
All committers have signed the CLA.

throw new ResponseStatusException(HttpStatus.NOT_FOUND, "Datasource needs to exist before updating", e);
}
public void updateDatasource(@PathVariable Long id, @Valid @RequestBody Datasource updateConfig) {
this.handleErrors(() -> { datasourceManager.updateDatasource(id, updateConfig); return null;});
Contributor Author

the return null; is necessary as the handleErrors function expects a CheckedSupplier which returns something.

It would be an option to overload that function expecting a (to be defined) CheckedRunnable that returns void, but that would mean more code duplication. It could have been so pretty 😢 ...
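
The overload described here could look roughly like the following sketch. It is plain Java without Spring so that it runs on its own. `CheckedRunnable`, `StatusException` (a stand-in for Spring's `ResponseStatusException`), the `ErrorHandling` class, and the contents of `ERROR_MAPPING` are assumptions for illustration, not the code in this PR.

```java
import java.util.Map;

// Hypothetical functional interfaces; the shapes are assumed, not copied from the PR.
@FunctionalInterface
interface CheckedSupplier<T> {
    T get() throws Exception;
}

@FunctionalInterface
interface CheckedRunnable {
    void run() throws Exception;
}

// Application-specific exception, modeled after the one visible in the diff.
class DatasourceNotFoundException extends Exception {
    DatasourceNotFoundException(long id) {
        super("Datasource with id " + id + " not found");
    }
}

// Stand-in for Spring's ResponseStatusException so the sketch has no dependencies.
class StatusException extends RuntimeException {
    final int status;

    StatusException(int status, String message, Throwable cause) {
        super(message, cause);
        this.status = status;
    }
}

class ErrorHandling {
    // Assumed mapping from exception type to HTTP status code.
    static final Map<Class<? extends Exception>, Integer> ERROR_MAPPING = Map.of(
            DatasourceNotFoundException.class, 404,
            IllegalArgumentException.class, 400);

    // Runs the supplier and translates any exception into a StatusException.
    // Exception types without a mapping fall back to 500.
    static <T> T handleErrors(CheckedSupplier<T> function) {
        try {
            return function.get();
        } catch (Exception e) {
            int status = ERROR_MAPPING.getOrDefault(e.getClass(), 500);
            throw new StatusException(status, e.getMessage(), e);
        }
    }

    // Overload for void operations: delegates to the supplier variant,
    // so "return null;" is written once here instead of at every call site.
    static void handleErrors(CheckedRunnable action) {
        handleErrors(() -> {
            action.run();
            return null;
        });
    }
}
```

With an overload like this, the call site from the diff could shrink to `this.handleErrors(() -> datasourceManager.updateDatasource(id, updateConfig));`. The duplication is limited to the three-line delegating method.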

Contributor

@sonallux sonallux left a comment

Nice work introducing application specific exceptions. I really like that, it is much cleaner now!

Instead of using our own handleErrors(...) function, you could also have used Spring's @ExceptionHandler and @ControllerAdvice annotations. Here and here you can find some tutorials on how to use these annotations if you want to go the Spring way of handling exceptions. This is just for reference; for me, it is also OK to stick with the handleErrors(...) function.

public class Mappings {
public static final String IMPORT_PATH = "/preview";
public static final String RAW_IMPORT_PATH = "/preview/raw";
public static final String FORMAT_PATH = "/formats";
public static final String PROTOCOL_PATH = "/protocols";
public static final String VERSION_PATH = "/version";

public static final Map<Class, HttpStatus> ERROR_MAPPING = Map.of(
Member

I wonder if a status code annotation directly at the exception is better?

@ResponseStatus(code = HttpStatus.BAD_REQUEST)
class CustomException extends RuntimeException {}

Probably you had some thoughts on that, do you mind sharing them? :)

Contributor Author

> I wonder if a status code annotation directly at the exception is better?
>
> Probably you had some thoughts on that, do you mind sharing them? :)

I honestly had no clue about the way you are "supposed" to do it in Spring. Gonna have to take a look at it in detail. The solution I went for was the first one that came to mind to avoid repetition in every endpoint method.
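
To make the trade-off in this thread concrete: with the annotation approach, the status code lives on the exception class itself and is looked up reflectively, instead of being kept in a central map. The sketch below imitates that mechanism in plain Java so it can run without Spring. `StatusCode`, `UnannotatedException`, and `StatusResolver` are hypothetical stand-ins, and Spring's actual handling of `@ResponseStatus` does considerably more than this.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical stand-in for Spring's @ResponseStatus; must be retained at
// runtime so it can be read via reflection.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface StatusCode {
    int value();
}

// The status code is declared right next to the exception it belongs to.
@StatusCode(400)
class CustomException extends RuntimeException {}

// An exception without the annotation, to show the fallback.
class UnannotatedException extends RuntimeException {}

class StatusResolver {
    // Reads the status code from the exception's class; defaults to 500
    // when the class carries no annotation.
    static int resolve(Throwable t) {
        StatusCode annotation = t.getClass().getAnnotation(StatusCode.class);
        return annotation != null ? annotation.value() : 500;
    }
}
```

The design difference is mainly about where the knowledge sits: the map keeps exception classes free of HTTP concerns, while the annotation keeps each status code next to the exception and needs no central registry.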


import lombok.NoArgsConstructor;

@NoArgsConstructor
Member

I would have assumed that lombok has some help for exceptions, but seems like the PR is not yet merged: projectlombok/lombok#2702

Datasource existing = datasourceRepository.findById(id)
.orElseThrow(() -> new IllegalArgumentException("Datasource with id " + id + " not found"));
public void updateDatasource(Long id, Datasource update) throws DatasourceNotFoundException {
Datasource existing = datasourceRepository.findById(id).orElseThrow(() -> new DatasourceNotFoundException(id));
Member

I think these chains are hard to read.. I guess we should introduce some linter or so in the future xD

Contributor Author

I actually kinda like chains like that, guess that comes from my preference for functional programming 😁.

@@ -20,52 +22,48 @@
public class DatasourceEndpoint {
private final DatasourceManager datasourceManager;

private <T> T handleErrors(CheckedSupplier<T> function) throws ResponseStatusException {
Member

duplicate with other endpoints => should we use a convert method and place it next to the mapping?

Contributor Author

There are two different Mappings classes though, one for the Adapter, and one for the Datasources.

I have not found a good way to avoid this duplication yet. Someone got any ideas?


private Date timestamp;

@ManyToOne(fetch = FetchType.EAGER)
Member

That means the datasource is always also retrieved, right? In general, not a bad idea, but why do we not serialize it then? Just a little confused, do we need it?

Contributor Author

We need its id to create the URI to the data in DataImportMetadata.

Also the annotations are necessary for Spring to create the database tables and fields properly I think.

@OneToMany(cascade = CascadeType.ALL, mappedBy = "datasource")
@JsonIgnore
@EqualsAndHashCode.Exclude // needed to avoid an endless loop because of a circular reference
private Set<DataImport> dataImports;
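
The endless loop mentioned in the code comment arises when two classes reference each other and both include the reference in `hashCode()` or `equals()`. The following hand-written sketch shows roughly what excluding the back-reference achieves. `Source` and `ImportRecord` are simplified, hypothetical stand-ins for `Datasource` and `DataImport`, not the Lombok-generated code.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class Source {
    final long id;
    // Back-reference to the imports; deliberately NOT part of equals/hashCode,
    // which is the effect @EqualsAndHashCode.Exclude has on the field.
    final Set<ImportRecord> imports = new HashSet<>();

    Source(long id) {
        this.id = id;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Source && ((Source) o).id == id;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(id);
    }
}

class ImportRecord {
    final long id;
    final Source source;

    ImportRecord(long id, Source source) {
        this.id = id;
        this.source = source;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ImportRecord)) {
            return false;
        }
        ImportRecord other = (ImportRecord) o;
        return id == other.id && source.equals(other.source);
    }

    // Includes the owning source. This terminates only because
    // Source.hashCode() does not look at its imports in turn.
    @Override
    public int hashCode() {
        return Objects.hash(id, source);
    }
}
```

If `Source.hashCode()` also hashed `imports`, adding an `ImportRecord` to the set would call `ImportRecord.hashCode()`, which calls `Source.hashCode()`, which hashes the set again, and so on until the stack overflows.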
Member

When are they fetched? Always on a get? That might lead to a huge overhead once we have run 100 imports for a source... Maybe we should not link these concepts at all and just provide a datasourceId in the DataImport?

Contributor Author

I put in a fetch = FetchType.LAZY which should resolve your worries about performance.

> Maybe we should not link these concepts at all and just provide a datasourceId in the DataImport?

Usually I prefer doing things manually as well, but I am not sure if all the Spring SQL magic still works then. And to be honest, the foreign keys generated by using those annotations are kinda neat.

Contributor

@sonallux sonallux Feb 16, 2021

If we do not want this field, we can safely remove it, because the datasource field in the DataImport class is still annotated with @ManyToOne. This should be sufficient for Hibernate to generate the foreign keys.

Contributor Author

I use that field to generate the result of the /datasources/{id}/imports call though. But of course that could also be done using a function in the repository.

Contributor

Yeah, I would definitely go with a separate findAllByDatasourceId function in the DataImportRepository. This will be much faster because querying all other properties of the Datasource is useless when just returning the list of DataImports.

Contributor Author

Problem I see with this approach:

We want to differentiate between "return empty array because no data imports were found for the data source" and "return 404 because datasource does not exist".

If we use a function called findAllByDatasourceId it would just return nothing if the datasource does not exist at all.
So we kinda need to fetch the datasource details anyways (or check if it exists, which should not be a big difference).

Contributor

Ah, I haven't considered that 🙈

Member

Good point.. I am kind of unsure what is best...

I guess lazy loading is misleading when reading code - or at least I'd assume the relation is fetched since I don't work with Spring too often. So personally, I'd not have the reference and make a separate call to prevent these uncertainties from happening. The readability would be clear:

  • if datasource does not exist => return 404
  • if imports do not exist => return []

The obvious disadvantage: two database requests instead of one..

Contributor

> The obvious disadvantage: two database requests instead of one..

Well, I think having @OneToMany with FetchType.LAZY will also be two database requests...
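
The two-step lookup discussed in this thread (existence check first, then a query by datasource id) could be sketched as follows. The in-memory collections stand in for the two repositories, and `ImportLookup` and `DatasourceMissingException` are hypothetical names chosen for illustration. In the real service the first check would presumably be a repository call and the exception would map to a 404.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical unchecked exception; an endpoint would translate it to 404.
class DatasourceMissingException extends RuntimeException {
    DatasourceMissingException(long id) {
        super("Datasource with id " + id + " not found");
    }
}

class ImportLookup {
    // In-memory stand-ins for the datasource and data import repositories.
    private final Set<Long> datasourceIds;
    private final Map<Long, Long> datasourceIdByImportId;

    ImportLookup(Set<Long> datasourceIds, Map<Long, Long> datasourceIdByImportId) {
        this.datasourceIds = datasourceIds;
        this.datasourceIdByImportId = datasourceIdByImportId;
    }

    // Returns the import ids of a datasource in ascending order.
    List<Long> findImports(long datasourceId) {
        // Request 1: does the datasource exist at all? If not, signal "not found".
        if (!datasourceIds.contains(datasourceId)) {
            throw new DatasourceMissingException(datasourceId);
        }
        // Request 2: fetch the imports. An empty list is a valid result
        // and means "datasource exists, but nothing was imported yet".
        return datasourceIdByImportId.entrySet().stream()
                .filter(entry -> entry.getValue().longValue() == datasourceId)
                .map(Map.Entry::getKey)
                .sorted()
                .collect(Collectors.toList());
    }
}
```

This keeps the two outcomes visibly separate in the code, at the cost of the second request that the thread already points out.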

@lunedis
Contributor Author

lunedis commented Feb 24, 2021

I updated the PR and changed the exception handling to be more spring-y (?, springful? springesque?).

@sonallux could you please take a look?

Also that was a very short endeavor into the functional side of Java... 😢

georg-schwarz
georg-schwarz previously approved these changes Feb 24, 2021
Contributor

@sonallux sonallux left a comment

Just two very minor things, otherwise LGTM. The exception handling is now much more springish 👍

This PR also fixes #232 🚀

@georg-schwarz
Member

Nice! I think this feature is a good step forward :)

Development

Successfully merging this pull request may close these issues.

Adapter: Get imports and imported data of datasource via API
4 participants