Add extended statistics CSV export
This patch brings statistics CSV exports to the external API.

Using the new endpoint, event/series statistics can be exported to CSV
along with all available metadata.

Pagination and filtering by metadata fields are supported.

This work is sponsored by ETH.
Kim Rinnewitz committed Oct 2, 2019
1 parent 77cd59d commit 2c43271
Showing 16 changed files with 746 additions and 28 deletions.
62 changes: 62 additions & 0 deletions docs/guides/developer/docs/api/statistics-api.md
Expand Up @@ -258,3 +258,65 @@ Field name |Type | Description
`values` | [`array[integer]`](#extended) | The values of the measurement points
`total` | [`integer`](#basic) | The sum of all values

### POST /api/statistics/data/export.csv

Retrieves statistical data in CSV format.

Form Parameters | Required |Type | Description
:---------------|:---------|:-------------------------------------|:-----------
`data` | yes | [`array[object]`](types.md#extended) | A JSON object describing the statistics query to request (see below)
`filter`         | no       | [`string`](types.md#basic)           | A comma-separated list of filters to limit the results with (see [Filtering](usage.md#filtering)). All standard Dublin Core metadata fields are filterable.
`limit` | no | [`integer`](types.md#basic) | The maximum number of resources to return (see [Pagination](usage.md#pagination))
`offset` | no | [`integer`](types.md#basic) | The index of the first resource to return (see [Pagination](usage.md#pagination))

Note that `limit` and `offset` refer to resources here, not CSV lines. A single resource (e.g. an event) can
produce multiple CSV lines, so pagination operates on resources such as events, never on individual lines.
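The `name:value` filter format described above can be sketched with a small parser. This is an illustrative helper, not the actual server implementation; `parseFilters` is a hypothetical name:

```java
import java.util.Arrays;
import java.util.Map;
import java.util.stream.Collectors;

public class FilterParsing {
  // Hypothetical helper: split "name:value,name2:value2" into a map.
  // Entries without a colon are ignored; only the first colon of each
  // entry separates name and value, so values may themselves contain colons.
  static Map<String, String> parseFilters(String filter) {
    return Arrays.stream(filter == null ? new String[0] : filter.split(","))
        .filter(f -> f.contains(":"))
        .collect(Collectors.toMap(
            f -> f.substring(0, f.indexOf(':')),
            f -> f.substring(f.indexOf(':') + 1)));
  }

  public static void main(String[] args) {
    System.out.println(parseFilters("presenters:Hans Dampf,start:2019-01-01T00:00:00Z"));
  }
}
```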

A query JSON object contains information about a statistics query to be executed:

Field | Required | Type | Description
:------------|:---------|:-------------------------|:-----------
`provider` | yes | [`property`](#extended) | A JSON object with information about the statistics provider to be queried
`parameters` | yes | [`property`](#extended) | A JSON object containing the parameters

Here, a statistics provider JSON object has the following fields:

Field | Type | Description
:--------------|:------------------------------------------|:-----------
`identifier` | [`string`](types.md#basic) | The unique identifier of the provider
`resourceType` | [`string`](types.md#basic) | The resource type of the provider

The parameters are the same as described [above](#time-series-statistics-provider), but with one additional field:

Field name |Type | Description
:----------------|:-------------------------------------|:-----------
`detailLevel`    | [`string`](types.md#basic)            | `EPISODE`, `SERIES`, or `ORGANIZATION` (only available for CSV exports)

__Example__

data:
```
{
"parameters": {
"resourceId": "mh_default_org",
"detailLevel": "EPISODE",
"from": "2018-12-31T23:00:00.000Z",
"to": "2019-12-31T22:59:59.999Z",
"dataResolution": "YEARLY"
},
"provider": {
"identifier": "organization.views.sum.influx",
"resourceType": "organization"
}
}
```
filter:
```
presenters:Hans Dampf
```

__Response__

`200 (OK)`: A (potentially empty) CSV file containing the resource statistics with all available metadata
`400 (BAD REQUEST)`: The request was not valid
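As a client-side sketch, the request body for this endpoint is an `application/x-www-form-urlencoded` form. The helper below (hypothetical, not part of the API) builds such a body from the example above; sending it is left to any HTTP client:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class ExportRequest {
  // Build the form body for POST /api/statistics/data/export.csv.
  static String formBody(String data, String filter, int limit, int offset) {
    StringBuilder sb = new StringBuilder();
    sb.append("data=").append(URLEncoder.encode(data, StandardCharsets.UTF_8));
    if (filter != null && !filter.isEmpty()) {
      sb.append("&filter=").append(URLEncoder.encode(filter, StandardCharsets.UTF_8));
    }
    sb.append("&limit=").append(limit).append("&offset=").append(offset);
    return sb.toString();
  }

  public static void main(String[] args) {
    String data = "{\"provider\":{\"identifier\":\"organization.views.sum.influx\","
        + "\"resourceType\":\"organization\"},"
        + "\"parameters\":{\"resourceId\":\"mh_default_org\",\"detailLevel\":\"EPISODE\","
        + "\"from\":\"2018-12-31T23:00:00.000Z\",\"to\":\"2019-12-31T22:59:59.999Z\","
        + "\"dataResolution\":\"YEARLY\"}}";
    // Send this string as the body of a POST to /api/statistics/data/export.csv
    // with Content-Type: application/x-www-form-urlencoded.
    System.out.println(formBody(data, "presenters:Hans Dampf", 10, 0));
  }
}
```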

Expand Up @@ -7,3 +7,23 @@
#
# Default: not set
#series.to.event.provider.mappings=

# A comma-separated list of mappings from organization provider IDs to episode provider IDs. The first part of
# each mapping must be an organization provider ID; a colon separates it from the second part, the corresponding
# episode provider ID. If such a mapping is defined, then exporting organization statistics exports the data of
# all events of the organization instead of the organization-level data. For providers without a mapping, only
# the less detailed organization-level data can be exported.
# Example: organization.to.event.provider.mappings=organization.views.sum.influx:episode.views.sum.influx,foo:bar
#
# Default: not set
#organization.to.event.provider.mappings=
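The mapping syntax above (comma-separated `org-provider:episode-provider` pairs, with a fallback to organization-level data when no mapping exists) can be illustrated with a small sketch; `parse` is a hypothetical helper, not the actual Opencast parser:

```java
import java.util.Arrays;
import java.util.Map;
import java.util.stream.Collectors;

public class ProviderMappings {
  // Hypothetical parser for "orgProvider:episodeProvider,foo:bar" style mappings.
  // Splitting with limit 2 keeps any further colons inside the second part.
  static Map<String, String> parse(String config) {
    return Arrays.stream(config.split(","))
        .map(m -> m.split(":", 2))
        .filter(parts -> parts.length == 2)
        .collect(Collectors.toMap(parts -> parts[0], parts -> parts[1]));
  }

  public static void main(String[] args) {
    Map<String, String> mappings =
        parse("organization.views.sum.influx:episode.views.sum.influx,foo:bar");
    // Providers without a mapping fall back to organization-level data.
    String target = mappings.getOrDefault("unmapped.provider", "organization-level");
    System.out.println(target);
  }
}
```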

# A comma-separated list of mappings from organization provider IDs to series provider IDs. The first part of
# each mapping must be an organization provider ID; a colon separates it from the second part, the corresponding
# series provider ID. If such a mapping is defined, then exporting organization statistics exports the data of
# all series of the organization instead of the organization-level data. For providers without a mapping, only
# the less detailed organization-level data can be exported.
# Example: organization.to.series.provider.mappings=organization.views.sum.influx:series.views.sum.influx,foo:bar
#
# Default: not set
#organization.to.series.provider.mappings=
1 change: 1 addition & 0 deletions etc/security/mh_default_org.xml
Expand Up @@ -216,6 +216,7 @@
<sec:intercept-url pattern="/api/series" method="POST" access="ROLE_ADMIN, ROLE_API_SERIES_CREATE"/>
<sec:intercept-url pattern="/api/security/sign" method="POST" access="ROLE_ADMIN, ROLE_API_SECURITY_EDIT"/>
<sec:intercept-url pattern="/api/statistics/data/query" method="POST" access="ROLE_ADMIN, ROLE_API_STATISTICS_VIEW"/>
<sec:intercept-url pattern="/api/statistics/data/export.csv" method="POST" access="ROLE_ADMIN, ROLE_API_STATISTICS_VIEW"/>
<sec:intercept-url pattern="/api/workflows" method="POST" access="ROLE_ADMIN, ROLE_API_WORKFLOW_INSTANCE_CREATE"/>
<!-- External API DELETE Endpoints -->
<sec:intercept-url pattern="/api/events/*" method="DELETE" access="ROLE_ADMIN, ROLE_API_EVENTS_DELETE"/>
Expand Down
1 change: 1 addition & 0 deletions etc/security/security_sample_ldap.xml-example
Expand Up @@ -204,6 +204,7 @@
<sec:intercept-url pattern="/api/series" method="POST" access="ROLE_ADMIN, ROLE_API_SERIES_CREATE"/>
<sec:intercept-url pattern="/api/security/sign" method="POST" access="ROLE_ADMIN, ROLE_API_SECURITY_EDIT"/>
<sec:intercept-url pattern="/api/statistics/data/query" method="POST" access="ROLE_ADMIN, ROLE_API_STATISTICS_VIEW"/>
<sec:intercept-url pattern="/api/statistics/data/export.csv" method="POST" access="ROLE_ADMIN, ROLE_API_STATISTICS_VIEW"/>
<!-- External API DELETE Endpoints -->
<sec:intercept-url pattern="/api/events/*" method="DELETE" access="ROLE_ADMIN, ROLE_API_EVENTS_DELETE"/>
<sec:intercept-url pattern="/api/events/*/acl/*/*" method="DELETE" access="ROLE_ADMIN, ROLE_API_EVENTS_ACL_DELETE"/>
Expand Down
6 changes: 6 additions & 0 deletions modules/external-api/pom.xml
Expand Up @@ -74,6 +74,12 @@
<version>${project.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.opencastproject</groupId>
<artifactId>opencast-statistics-export-service</artifactId>
<version>${project.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.opencastproject</groupId>
<artifactId>opencast-workflow-service-api</artifactId>
Expand Down
Expand Up @@ -618,7 +618,7 @@ public Response getEvents(@HeaderParam("Accept") String acceptHeader, @QueryPara
String value;

if (!requestedVersion.isSmallerThan(ApiVersion.VERSION_1_1_0)) {
// MH-13038 - 1.1.0 and higher support semi-colons in values
// MH-13038 - 1.1.0 and higher support colons in values
value = f.substring(name.length() + 1);
} else {
value = filterTuple[1];
Expand Down
Expand Up @@ -41,6 +41,8 @@
import org.opencastproject.statistics.api.ResourceType;
import org.opencastproject.statistics.api.StatisticsProvider;
import org.opencastproject.statistics.api.StatisticsService;
import org.opencastproject.statistics.export.api.StatisticsExportService;
import org.opencastproject.util.NotFoundException;
import org.opencastproject.util.RestUtil;
import org.opencastproject.util.doc.rest.RestParameter;
import org.opencastproject.util.doc.rest.RestQuery;
Expand All @@ -51,13 +53,19 @@

import org.apache.commons.lang3.StringUtils;
import org.json.simple.JSONArray;
import org.json.simple.JSONObject;
import org.osgi.service.component.ComponentContext;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.time.ZoneId;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.stream.Collectors;

import javax.servlet.http.HttpServletResponse;
import javax.ws.rs.FormParam;
Expand All @@ -84,6 +92,7 @@ public class StatisticsEndpoint {
private IndexService indexService;
private ExternalIndex externalIndex;
private StatisticsService statisticsService;
private StatisticsExportService statisticsExportService;

public void setSecurityService(SecurityService securityService) {
this.securityService = securityService;
Expand All @@ -101,6 +110,10 @@ public void setStatisticsService(StatisticsService statisticsService) {
this.statisticsService = statisticsService;
}

public void setStatisticsExportService(StatisticsExportService statisticsExportService) {
this.statisticsExportService = statisticsExportService;
}

/** OSGi activation method */
void activate(ComponentContext cc) {
logger.info("Activating External API - Statistics Endpoint");
Expand Down Expand Up @@ -247,6 +260,81 @@ public Response getStatistics(@HeaderParam("Accept") String acceptHeader, @FormP
return ApiResponses.Json.ok(acceptHeader, result.toJSONString());
}

@POST
@Produces({ ApiMediaType.JSON, ApiMediaType.VERSION_1_4_0 })
@Path("data/export.csv")
@RestQuery(
name = "getexportcsv",
description = "Returns a statistics CSV export",
returnDescription = "The requested statistics CSV export",
restParameters = {
@RestParameter(
name = "data", description = "A JSON object describing the query to be executed",
isRequired = true, type = RestParameter.Type.TEXT),
@RestParameter(
name = "limit", description = "Limit for pagination.",
isRequired = false, type = RestParameter.Type.INTEGER),
@RestParameter(
name = "offset", description = "Offset for pagination.",
isRequired = false, type = RestParameter.Type.INTEGER),
@RestParameter(
name = "filter", description = "A comma-separated list of filters to limit the results with. A filter consists of the filter's name followed by a colon \":\" and then the value to filter with, i.e. it is of the form <Filter Name>:<Value to Filter With>.",
isRequired = false, type = RestParameter.Type.STRING)
},
reponses = {
@RestResponse(
description = "Returns the CSV data requested by the query as plain text",
responseCode = HttpServletResponse.SC_OK),
@RestResponse(
description = "If the current user is not authorized to perform this action",
responseCode = HttpServletResponse.SC_UNAUTHORIZED)
})
public Response getExportCSV(
@HeaderParam("Accept") String acceptHeader,
@FormParam("data") String data,
@FormParam("limit") Integer limit,
@FormParam("offset") Integer offset,
@FormParam("filter") String filter
) throws NotFoundException, SearchIndexException, UnauthorizedException {

final int lim = limit != null ? Math.max(0, limit) : 0;
final int off = offset != null ? Math.max(0, offset) : 0;

final Map<String, String> filters = Arrays.stream(Optional.ofNullable(filter).orElse("")
.split(","))
.filter(f -> f.contains(":"))
.collect(Collectors.toMap(
f -> f.substring(0, f.indexOf(":")),
f -> f.substring(f.indexOf(":") + 1)));

QueryUtils.Query query = null;
try {
query = QueryUtils.parseQuery(data, statisticsService);
} catch (Exception e) {
logger.debug("Unable to parse form parameter 'data' {}", data, e);
return RestUtil.R.badRequest("Unable to parse form parameter 'data': " + e.getMessage());
}
checkAccess(query.getParameters().getResourceId(), query.getProvider().getResourceType());

final QueryUtils.ExportParameters parameters = (QueryUtils.ExportParameters) query.getParameters();
final String result = statisticsExportService.getCSV(
query.getProvider(),
parameters.getResourceId(),
parameters.getFrom(),
parameters.getTo(),
parameters.getDataResolution(),
this.externalIndex,
ZoneId.systemDefault(),
true,
parameters.getDetailLevel(),
lim,
off,
filters
);

return ApiResponses.Json.ok(acceptHeader, new JSONObject(Collections.singletonMap("csv", result)).toJSONString());
}

private void checkAccess(final String resourceId, final ResourceType resourceType) {
try {
switch (resourceType) {
Expand Down
@@ -0,0 +1,55 @@
/**
* Licensed to The Apereo Foundation under one or more contributor license
* agreements. See the NOTICE file distributed with this work for additional
* information regarding copyright ownership.
*
*
* The Apereo Foundation licenses this file to you under the Educational
* Community License, Version 2.0 (the "License"); you may not use this file
* except in compliance with the License. You may obtain a copy of the License
* at:
*
* http://opensource.org/licenses/ecl2.txt
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
* WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
* License for the specific language governing permissions and limitations under
* the License.
*
*/
package org.opencastproject.external.util.statistics;

import org.opencastproject.statistics.export.api.DetailLevel;

import org.json.simple.JSONArray;

import java.util.Optional;
import java.util.Set;

public final class DetailLevelUtils {

private DetailLevelUtils() {
}

public static JSONArray toJson(Set<DetailLevel> detailLevels) {
JSONArray result = new JSONArray();
for (DetailLevel detailLevel : detailLevels) {
result.add(toString(detailLevel));
}
return result;
}

public static String toString(DetailLevel detailLevel) {
return detailLevel.toString().toLowerCase();
}

public static Optional<DetailLevel> fromString(String detailLevel) {
try {
return Optional.of(Enum.valueOf(DetailLevel.class, detailLevel.toUpperCase()));
} catch (IllegalArgumentException e) {
return Optional.empty();
}
}

}
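The intended round-trip behavior of these utilities can be sketched in a self-contained form. A stand-in enum replaces `DetailLevel` (which lives in `opencast-statistics-export-service`), so the class name and enum here are illustrative only:

```java
import java.util.Optional;

public class DetailLevelRoundTrip {
  // Stand-in for org.opencastproject.statistics.export.api.DetailLevel.
  enum DetailLevel { EPISODE, SERIES, ORGANIZATION }

  // Serialize to the lowercase wire format used in JSON responses.
  static String toString(DetailLevel detailLevel) {
    return detailLevel.toString().toLowerCase();
  }

  // Case-insensitive parsing; unknown values yield an empty Optional
  // instead of throwing, so callers can map them to a 400 response.
  static Optional<DetailLevel> fromString(String detailLevel) {
    try {
      return Optional.of(Enum.valueOf(DetailLevel.class, detailLevel.toUpperCase()));
    } catch (IllegalArgumentException e) {
      return Optional.empty();
    }
  }

  public static void main(String[] args) {
    System.out.println(fromString("episode").isPresent());
    System.out.println(fromString("bogus").isPresent());
  }
}
```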
Expand Up @@ -25,6 +25,7 @@
import org.opencastproject.statistics.api.StatisticsService;
import org.opencastproject.statistics.api.TimeSeries;
import org.opencastproject.statistics.api.TimeSeriesProvider;
import org.opencastproject.statistics.export.api.DetailLevel;

import org.apache.commons.lang3.StringUtils;
import org.json.simple.JSONArray;
Expand Down Expand Up @@ -180,6 +181,27 @@ void validate() {
}
}

public static class ExportParameters extends TimeSeriesParameters {

private DetailLevel detailLevel;

ExportParameters(String resourceId, JSONObject raw) {
super(resourceId, raw);
}

void setDetailLevel(String detailLevel) {
Optional<DetailLevel> level = DetailLevelUtils.fromString(detailLevel);
if (!level.isPresent()) {
throw new IllegalArgumentException("Illegal value for 'detailLevel'");
}
this.detailLevel = level.get();
}

public DetailLevel getDetailLevel() {
return this.detailLevel;
}
}

public static List<Query> parse(String queryString, StatisticsService statisticsService) {

if (StringUtils.isBlank(queryString)) {
Expand All @@ -204,6 +226,22 @@ public static List<Query> parse(String queryString, StatisticsService statistics
return queries;
}

public static Query parseQuery(String queryString, StatisticsService statisticsService) {
if (StringUtils.isBlank(queryString)) {
throw new IllegalArgumentException("No query data provided");
}

JSONParser parser = new JSONParser();
JSONObject queryJson;
try {
queryJson = (JSONObject) parser.parse(queryString);
} catch (ParseException e) {
throw new IllegalArgumentException("JSON malformed");
}
return parseQuery(queryJson, statisticsService);

}

private static Query parseQuery(JSONObject queryJson, StatisticsService statisticsService) {

// Get the mandatory provider identifier
Expand All @@ -230,7 +268,13 @@ public static Parameters parseParameters(JSONObject parametersJson, StatisticsPr

// The other parameters are specific to statistics provider implementations
if (provider instanceof TimeSeriesProvider) {
TimeSeriesParameters p = new TimeSeriesParameters(resourceId, parametersJson);
TimeSeriesParameters p;
if (parametersJson.containsKey("detailLevel")) {
p = new ExportParameters(resourceId, parametersJson);
((ExportParameters) p).setDetailLevel(getField(parametersJson, "detailLevel", "Parameter 'detailLevel' is missing"));
} else {
p = new TimeSeriesParameters(resourceId, parametersJson);
}
p.setFrom(getField(parametersJson, "from", "Parameter 'from' is missing"));
p.setTo(getField(parametersJson, "to", "Parameter 'to' is missing"));
p.setDataResolution(getField(parametersJson, "dataResolution", "Parameter 'dataResolution' is missing"));
Expand Down
