# Web Services

## Topics
- What are web services?
- Demo for setting up minimal web API using ASP.NET
- Deeper look into ASP.NET
- Controllers in ASP.NET
- Different types of web services


## What are Web Services

In short: a web-based application that returns information that is *typically* intended to be consumed by another application.

However, for academic purposes it is worth diving deeper into the subject.

### What is Web

The Web is an ecosystem of standards and technologies mainly based on the HTTP(S) protocol.

The Web is based on a client-server architecture[^1]:
1. The client opens a TCP connection and sends a request to the server.
2. The server responds to the request.

The most common type of web client is an internet browser. Whenever you type an address into a web browser, it sends an HTTP GET request to that server.

[^1]: HTTP/2 mixed things up a bit by introducing server push, which allows the server to push resources to the client once the connection is established.

### How to define the "service"

[Cambridge dictionary](https://dictionary.cambridge.org/dictionary/english/service) defines *service* as: “a government system or private organization that is responsible for a particular type of activity, or for providing a particular thing that people need”.

Looking at the definition from a systems perspective, we can say a service is something that either:
- Gives us something that we want.
- Performs some action that we want.

### Web-service

A web service can be loosely defined as a program that works over HTTP(S) and can be interacted with using machine-readable content types (formats).

The emphasis is on machine-readable. It could be said that what differentiates websites from web services is that a website is intended to be interacted with by humans, while a web service is intended to be interacted with by machines.

### URL structure

A sample URL `https://github.com/smagurauskas/software-engineering?something=maybe` can be deconstructed into the following parts:

1. `https://`, which denotes the protocol (scheme) used for communication, in this case `https`.
2. `github.com`, which denotes the domain or host. It can be further divided into `com` being the top-level domain (TLD), `github` being the second-level domain, and so on.
3. `smagurauskas/software-engineering`, which denotes the path.
4. `something=maybe`, which denotes the query parameters. Query parameters act as key-value pairs, where `something` is the key and `maybe` is the value. Multiple query parameters can be provided by chaining them with `&`.

URLs are defined in the [RFC 3986](https://www.rfc-editor.org/rfc/rfc3986) memo.
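The decomposition above can be reproduced with Python's standard `urllib.parse` module. This is a minimal sketch; the helper name `deconstruct` is made up for illustration. Note that the parser reports the path with its leading `/` and returns every query value as a list, because a key may be repeated.

```python
from urllib.parse import urlsplit, parse_qs


def deconstruct(url: str) -> dict:
    """Split a URL into scheme, host, path and query parameters."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "path": parts.path,
        # parse_qs maps each key to a list of values.
        "query": parse_qs(parts.query),
    }


parts = deconstruct(
    "https://github.com/smagurauskas/software-engineering?something=maybe"
)
print(parts)
```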

### HTTP

HTTP stands for HyperText Transfer Protocol. It is an application layer protocol that was traditionally built on top of the TCP transport protocol (HTTP/3 runs over QUIC instead).

HTTP allows clients to make requests by specifying the request method, path, query parameters and various accompanying headers.

Methods are typically used to identify what action should be performed on the specific resource.
Headers provide additional information, such as how the request should be interpreted (read), how the server should respond, authorization information and much more.
The query string passes additional information for handling the request.

Some HTTP methods allow a request *body*, which is where the payload of the request is usually carried.

### Sample HTTP request

An HTTP request in plain text looks like this:

```text
GET / HTTP/1.1
Host: github.com
Accept: text/html
```

In practice, however, you almost never form the requests yourself. You use an HTTP library that abstracts away most of the internals.
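To make the message format concrete, here is a minimal sketch of what such a library does when it serializes an HTTP/1.1 request. The function `build_request` is hypothetical and only covers requests without a body: a request line, a `Host` header, any extra headers, and an empty line that ends the header section.

```python
from typing import Dict, Optional


def build_request(
    method: str,
    target: str,
    host: str,
    headers: Optional[Dict[str, str]] = None,
) -> str:
    """Compose a body-less HTTP/1.1 request message as plain text."""
    lines = [f"{method} {target} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # Every line ends with CRLF; an empty line terminates the headers.
    return "\r\n".join(lines) + "\r\n\r\n"


request_text = build_request("GET", "/", "github.com", {"Accept": "text/html"})
print(request_text)
```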

### HTTP methods

Some of the most frequently used methods are:
- `GET` - the "default" method (at least for browsers), used to retrieve the resource.
- `POST` - method for creating a resource or invoking a command. Has a body where the payload can be transferred.
- `PUT` - method for updating the resource. Has a body, similarly to `POST`.
- `DELETE` - method for deleting the resource.
- `OPTIONS` - used by browsers for CORS preflight requests. [See more about CORS.](https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS)

There are various other methods, each of which has a canonical meaning and use case assigned to it. Read more about them in [RFC 9110](https://www.rfc-editor.org/rfc/rfc9110.html#name-methods).

HTTP methods can be further characterized as safe and as idempotent. A safe method, such as `GET`, does not change the state of the system. An idempotent method can be called multiple times with the same effect on the system as a single call (assuming nothing else changes the state in between). `GET` is the primary example, but `PUT` and `DELETE` are idempotent as well, even though they change state. Non-idempotent methods may change the state further on every call. The typical example is `POST`, where repeating the request can create a new resource each time.
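In RFC 9110's terms, a method is idempotent when repeating the same request has the same effect on the server as sending it once. The sketch below illustrates this with a hypothetical in-memory `CourseStore` that stands in for server state. The class and its method names are made up for illustration and no real HTTP is involved.

```python
import itertools


class CourseStore:
    """Hypothetical in-memory stand-in for a server's state."""

    def __init__(self) -> None:
        self.courses = {}
        self._ids = itertools.count(1)

    def get(self, course_id):
        # Safe and idempotent: reads state and never changes it.
        return self.courses.get(course_id)

    def put(self, course_id, name):
        # Idempotent: repeating the call leaves the same final state.
        self.courses[course_id] = name

    def post(self, name):
        # Not idempotent: every call creates another resource.
        course_id = next(self._ids)
        self.courses[course_id] = name
        return course_id


store = CourseStore()
store.put(100, "Software Engineering")
store.put(100, "Software Engineering")
after_put = len(store.courses)  # still a single course

store.post("Web Services")
store.post("Web Services")
after_post = len(store.courses)  # two more courses were created
print(after_put, after_post)
```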

### HTTPS

HTTPS is an extension of HTTP that adds Transport Layer Security (TLS), which encrypts messages in transit so that only the intended receiver can decrypt them. If another party were to see the request in transit, it would not be able to make any sense of it.

HTTPS relies on Certificate Authorities (CAs) for issuing certificates. A well-known CA issues a certificate for a website, and the client can verify that the certificate really was issued by that CA. This model relies on the notion that there is only a very limited number of CAs and an unlimited number of websites. Operating systems *typically* come bundled with a predefined list of trusted CAs, so an HTTPS client can check locally whether the certificate provided by the server is correctly signed by one of them.
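As a small illustration of the client side of this model, Python's standard `ssl` module can build a client context that loads the system's trusted CA certificates and requires the server certificate to validate against them. This sketch only inspects the context and opens no connection; the store statistics it prints depend on the machine it runs on.

```python
import ssl

# Client-side context: loads the default trusted CAs and turns on
# certificate and hostname verification.
context = ssl.create_default_context()

print(context.verify_mode == ssl.CERT_REQUIRED)  # server certificate is mandatory
print(context.check_hostname)                    # certificate must match the host name
print(context.cert_store_stats())                # counts of loaded CA certificates
```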

[Read more on SSL/TLS here.](https://security.stackexchange.com/questions/20803/how-does-ssl-tls-work/20833#20833)

In the past, because there was a limited number of well-known CAs, getting an HTTPS certificate used to be quite expensive. Currently there are non-profit CAs like [Let's Encrypt](https://letsencrypt.org/) that issue certificates for free.

Because certificates are so easy to obtain nowadays, it is considered bad practice not to run a production system on HTTPS.

### What is an API

API stands for Application Programming Interface. Web services are considered to be APIs, but the term API is not limited to web services.

APIs can also be created via mechanisms other than the Web. An example of a non-Web API is an IPC mechanism such as named pipes or memory-mapped files, where one process writes to a file that is stored in memory and other processes can read from that file.

There is a good Stack Overflow answer on how APIs relate to web services:

> An API (Application Programming Interface) is the means by which third parties can write code that interfaces with other code. A Web Service is a type of API, one that almost always operates over HTTP (though some, like SOAP, can use alternate transports, like SMTP). The official W3C definition mentions that Web Services don't necessarily use HTTP, but this is almost always the case and is usually assumed unless mentioned otherwise.

> For examples of web services specifically, see SOAP, REST, and XML-RPC. For an example of another type of API, one written in C for use on a local machine, see the Linux Kernel API.

> As far as the protocol goes, a Web service API almost always uses HTTP (hence the Web part), and definitely involves communication over a network. APIs in general can use any means of communication they wish. The Linux kernel API, for example, uses Interrupts to invoke the system calls that comprise its API for calls from user space.

[https://stackoverflow.com/questions/808421/api-vs-webservice/808467#808467](https://stackoverflow.com/questions/808421/api-vs-webservice/808467#808467).

### HTTP Content types

HTTP has a `Content-Type` header that denotes the format (type) of the message content. It is used for both requests and responses.

The value of the `Content-Type` header is called a media type or MIME type. MIME stands for Multipurpose Internet Mail Extensions. MIME types are defined in [RFC 6838](https://datatracker.ietf.org/doc/html/rfc6838).

The `Content-Type` value has the structure `type/subtype`. It can additionally be followed by a parameter after a `;`, for example `text/html; charset=utf-8`.

`type` generally indicates what kind of content the message is going to contain. Among others, there are types such as `text` and `application`. `text` messages are intended to be consumed by humans, so they should be human-readable. `application` messages are structured so that they can be consumed by other applications. That does not prevent them from being human-readable; it is just that their goal is to be parseable.
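The `type/subtype; parameter` structure can be taken apart with Python's standard library. Since MIME originates from email, the sketch below reuses `email.message.Message` for the parsing. The helper name `parse_content_type` is made up for illustration.

```python
from email.message import Message


def parse_content_type(value: str) -> dict:
    """Split a Content-Type value into type, subtype and charset parameter."""
    message = Message()
    message["Content-Type"] = value
    return {
        "type": message.get_content_maintype(),
        "subtype": message.get_content_subtype(),
        # None when the header carries no charset parameter.
        "charset": message.get_param("charset"),
    }


parsed = parse_content_type("application/json; charset=utf-8")
print(parsed)
```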

[More on MIME types](https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types).

### Machine readable formats

Although there are [multiple `application` types](https://www.iana.org/assignments/media-types/media-types.xhtml#application), currently the most common ones are `application/json`, `application/xml` and `application/yaml`. Of these, the most popular by far is `json`.

See more in [tag popularity in stack overflow questions](https://trends.stackoverflow.co/?tags=json,xml,yaml).

#### XML

XML stands for eXtensible Markup Language.

It is the standard format for web API protocols such as SOAP. As SOAP and related protocols fell in popularity, so did XML, and it is not as popular for new developments currently.

The decline also coincides with the fact that XML does not deserialize easily into objects in typical object-oriented languages, because of its attribute structure.

XML also has accompanying standards like XSLT, which transforms XML documents into different ones, and XSD, which defines a schema against which an XML document can be validated.

##### XML sample

```xml
<Courses>
    <Course name="Software Engineering" description="...">
        <Subject>Web Services</Subject>
        <Subject>APIs</Subject>
    </Course>
</Courses>
```
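As a minimal sketch of consuming this document from code, Python's standard `xml.etree.ElementTree` can parse the sample. Note that the course name has to be read from an attribute while the subjects are read from child elements, which is the kind of mapping decision that a plain object model does not have to make.

```python
import xml.etree.ElementTree as ET

XML_DOCUMENT = """
<Courses>
    <Course name="Software Engineering" description="...">
        <Subject>Web Services</Subject>
        <Subject>APIs</Subject>
    </Course>
</Courses>
"""

root = ET.fromstring(XML_DOCUMENT.strip())
courses = [
    {
        # The name is stored in an attribute of <Course>.
        "name": course.get("name"),
        # The subjects are stored as child elements.
        "subjects": [subject.text for subject in course.findall("Subject")],
    }
    for course in root.findall("Course")
]
print(courses)
```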

#### JSON

JSON stands for JavaScript Object Notation. It is native to the JavaScript language: serialized content can be pasted directly into a JS script and it will work.

It grew in popularity due to the prevalence of JavaScript.

Due to its simplicity, JSON is pretty easy to parse and it does not create much mental overhead.

##### JSON sample

```json
{
    "Courses": 
    [
        { 
            "Name": "Software Engineering",
            "Subjects": 
            [
                "Web Services",
                "APIs"
            ]
        }
    ]
}
```
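For comparison, a minimal sketch of reading the same data with Python's standard `json` module: objects become dictionaries and arrays become lists, so no extra mapping decisions are needed.

```python
import json

JSON_DOCUMENT = """
{
    "Courses": [
        {
            "Name": "Software Engineering",
            "Subjects": ["Web Services", "APIs"]
        }
    ]
}
"""

data = json.loads(JSON_DOCUMENT)
first_course = data["Courses"][0]
subjects = first_course["Subjects"]

# Serializing and parsing again yields an equal structure.
round_trip = json.loads(json.dumps(data))
print(first_course["Name"], subjects)
```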

## Common Web Service Architectures

This section provides an overview of some of the more common API architectures. The term "API architecture" is not definitive: it can be (and is) used interchangeably with the terms "protocol" or "standard", and in this context they all refer to the same thing.

The goal of API architecture is to define the constraints against which APIs are modeled, in turn making some implicit or explicit trade-offs. They provide guidelines on how the API should allow users to interact with business logic and what the requests or responses should look like.

#### REST

REST is an acronym that stands for **Re**presentational **S**tate **T**ransfer. REST is a stateless web service architecture that makes heavy use of the HTTP protocol.

REST was originally defined in [Roy Thomas Fielding's dissertation](https://ics.uci.edu/~fielding/pubs/dissertation/fielding_dissertation.pdf) in 2000.

##### REST's Uniform Interface

REST defines the "uniform interface", which is the most important part of what makes an API "REST".

The uniform interface imposes the following constraints:
- Resources are identified by their URIs.
- HTTP standard is used to describe communication and actions.
- Resource representations are decoupled from their internal representation.
- All the related resources must be navigable from any resource.

##### Resources are identified by their URIs

URIs fully describe the resources, including the protocol, location and the resources themselves. In REST every individual resource must have a URI that allows clients to interact with it.

Typically this is nothing more than a URL, e.g. `https://mif.vu.lt/location#resource`. Strictly speaking, the final part of the example string (`#resource`) is a fragment: it is part of the URI syntax, but it is resolved by the client and is not sent to the server.

For example, if the study program has 10 courses, then every single one of the courses should have a URI that identifies exactly that course, e.g. `https://mif.vu.lt/software-engineering/se-1`.

[URL vs URN vs URI](https://www.pierobon.org/iis/url.htm).

##### HTTP standard is used to describe communication and actions

HTTP verbs are used to define the action on the requested resource.

Meaning that:
- `GET https://mif.vu.lt/software-engineering/se-1` - should return a representation of the resource.
- `DELETE https://mif.vu.lt/software-engineering/se-1` - should delete the resource.
- etc.
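The mapping from verbs to actions can be sketched as a tiny request handler. Everything here is hypothetical (the `RESOURCES` store, the `handle` function and the course title), and a real service would sit behind a web framework, but the status codes are the standard ones: `200 OK`, `204 No Content`, `404 Not Found` and `405 Method Not Allowed`.

```python
# Hypothetical server state: path -> resource.
RESOURCES = {"/software-engineering/se-1": {"title": "Software Engineering 1"}}


def handle(method: str, path: str):
    """Return a (status code, body) pair for a request."""
    if path not in RESOURCES:
        return 404, None
    if method == "GET":
        # Return a representation of the resource.
        return 200, RESOURCES[path]
    if method == "DELETE":
        # Remove the resource; nothing to send back.
        del RESOURCES[path]
        return 204, None
    return 405, None


first_get = handle("GET", "/software-engineering/se-1")
deleted = handle("DELETE", "/software-engineering/se-1")
second_get = handle("GET", "/software-engineering/se-1")
print(first_get, deleted, second_get)
```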

##### Resource representations are decoupled from their internal representation

In practice this means that the server must return a proper `Content-Type` header that tells the client how to parse the message.

Analogously, the client can request a different representation via the `Accept` header, and that should also be *fundamentally* supported by the server. *Fundamentally* here does not mean that the server is expected to support any niche media format specified in the `Accept` header; it means that the implementation is decoupled in such a way that adding support would be possible.
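A minimal sketch of this decoupling: one internal object and a table of renderers keyed by media type, so that supporting another format means adding one entry. The names `COURSE`, `RENDERERS` and `respond` are made up for illustration, and the sketch assumes the `Accept` header holds a single media type (real headers can list several with quality weights). An unsupported type is answered with `406 Not Acceptable`.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical internal representation of a resource.
COURSE = {"name": "Software Engineering", "credits": 5}


def to_json(course: dict) -> str:
    return json.dumps(course)


def to_xml(course: dict) -> str:
    # Render every field as an attribute of a single <Course> element.
    element = ET.Element("Course", {key: str(value) for key, value in course.items()})
    return ET.tostring(element, encoding="unicode")


# Media type -> renderer; new formats are added here.
RENDERERS = {"application/json": to_json, "application/xml": to_xml}


def respond(accept: str):
    """Return (status, content type, body) for a single requested media type."""
    renderer = RENDERERS.get(accept)
    if renderer is None:
        return 406, None, None
    return 200, accept, renderer(COURSE)


print(respond("application/json"))
print(respond("application/xml"))
print(respond("image/png"))
```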

##### All the related resources must be navigable from any resource

This is the most complicated constraint of the uniform interface. Engineers tend to skip this part because of the complexity of its implementation; however, Roy Fielding has highlighted that it is an essential part of REST.

In practice it means that related resources should be linked via their URIs in the representation:
```json
{
    "links": {
        "self": "https://domain/account/1",
        "next": "https://domain/account/2"
    },
    "account": {
        "owner": "Person name",
        "account_number": 123,
        "links": {
            "transfers": "https://domain/account/1/transfers",
            "withdrawals": "https://domain/account/1/withdrawals"
        }
    }
}
```
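The benefit for the client is that it only needs to know one entry URI and can discover the rest by following links. The sketch below shows this against a hypothetical in-memory `FAKE_SERVER` instead of a real HTTP endpoint. The second account and the helper names are assumptions added for illustration.

```python
# Hypothetical stand-in for a server: URI -> representation.
FAKE_SERVER = {
    "https://domain/account/1": {
        "links": {
            "self": "https://domain/account/1",
            "next": "https://domain/account/2",
        },
        "account": {"owner": "Person name", "account_number": 123},
    },
    "https://domain/account/2": {
        "links": {"self": "https://domain/account/2"},
        "account": {"owner": "Another person", "account_number": 456},
    },
}


def fetch(uri: str) -> dict:
    """Pretend to GET a representation by its URI."""
    return FAKE_SERVER[uri]


def walk_accounts(start_uri: str) -> list:
    """Follow the "next" links until there are none left."""
    owners = []
    uri = start_uri
    while uri is not None:
        representation = fetch(uri)
        owners.append(representation["account"]["owner"])
        uri = representation["links"].get("next")
    return owners


owners = walk_accounts("https://domain/account/1")
print(owners)
```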

#### GraphQL

GraphQL is a query language for APIs. It is typically served over HTTP. The exact protocol for doing so is not fully defined yet, but a draft is in the works at https://github.com/graphql/graphql-over-http.

The biggest advantage of GraphQL is that the client explicitly requests the properties it wants, addressing them by their full path in the navigational graph. For example, given this request:

```gql
{
    hero {
        name
    }
}
```

the response would only include:
```json
{
    "data": {
        "hero": {
            "name": "R2-D2"
        }
    }
}
```

but this does not mean that `hero` only has a `name`. A hero can have many more properties, but only the requested ones are returned. This provides a lot of flexibility on both the client and the server side.
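The field selection idea can be sketched without any GraphQL library. This is not a GraphQL implementation: the selection is modelled as a nested dictionary instead of a parsed query, and the extra `primaryFunction` field and its value are assumptions added only to show that unrequested data is left out.

```python
# Hypothetical server-side data; only "name" comes from the example above.
DATA = {"hero": {"name": "R2-D2", "primaryFunction": "Astromech"}}


def select(source: dict, selection: dict) -> dict:
    """Copy only the requested fields, following nested selections."""
    result = {}
    for field, sub_selection in selection.items():
        value = source[field]
        if isinstance(sub_selection, dict):
            # A nested selection: recurse into the related object.
            value = select(value, sub_selection)
        result[field] = value
    return result


# Equivalent of the query `{ hero { name } }`.
response = {"data": select(DATA, {"hero": {"name": True}})}
print(response)
```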

To change the resources GraphQL uses "mutations", which are defined in very similar syntax to queries.

https://studio.apollographql.com/public/star-wars-swapi/variant/current/explorer provides a nice playground to test and try out the GraphQL and how it works.

##### N+1 problem

Biggest concern with GraphQL is that it shifts the N+1 problem from the client side to the server side.

In essence, the N+1 problem means that if the main resource (for instance `movie`) is related to 5 other resources (for instance `actor`), then resolving it results in 6 (1 + 5) queries: one for the movie and one per actor. It is easy to see how the number of queries can explode as lists get longer and nesting gets deeper.

To work around this problem, GraphQL frameworks typically have batch loaders or similar capabilities to batch requests against the data source. This *usually* requires more upfront development effort on the API side, but potentially saves overall effort during the total product development.
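The difference between the naive and the batched approach can be sketched by counting queries against a hypothetical `FakeDatabase`. The class, its methods and its data are made up for illustration; real batch loaders additionally collect the keys requested by many resolvers before issuing one combined query.

```python
class FakeDatabase:
    """Hypothetical data source that counts how many queries it serves."""

    def __init__(self) -> None:
        self.query_count = 0
        self._movies = {1: {"title": "Movie", "actor_ids": [1, 2, 3, 4, 5]}}
        self._actors = {i: {"name": f"Actor {i}"} for i in range(1, 6)}

    def get_movie(self, movie_id):
        self.query_count += 1
        return self._movies[movie_id]

    def get_actor(self, actor_id):
        self.query_count += 1
        return self._actors[actor_id]

    def get_actors(self, actor_ids):
        # One query that returns all requested actors at once.
        self.query_count += 1
        return [self._actors[actor_id] for actor_id in actor_ids]


def resolve_naively(db, movie_id):
    movie = db.get_movie(movie_id)                          # 1 query
    return [db.get_actor(i) for i in movie["actor_ids"]]    # N queries


def resolve_batched(db, movie_id):
    movie = db.get_movie(movie_id)                          # 1 query
    return db.get_actors(movie["actor_ids"])                # 1 query


naive_db, batched_db = FakeDatabase(), FakeDatabase()
naive_result = resolve_naively(naive_db, 1)
batched_result = resolve_batched(batched_db, 1)
print(naive_db.query_count, batched_db.query_count)
```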

#### SOAP

SOAP stands for **S**imple **O**bject **A**ccess **P**rotocol.

SOAP is an old API architecture that is still running in multiple legacy systems, but hardly any new development is happening with it.

SOAP usually uses WSDL (Web Services Description Language) to describe its services and facilitate code generation.

By specification, SOAP messages are XML documents with a very specific request-response structure.
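To give an idea of that structure, the sketch below builds a SOAP 1.1 style message: an `Envelope` element containing a `Body`, which in turn wraps the operation being called. The envelope namespace is the SOAP 1.1 one, while the operation `GetCourse` and its `CourseId` parameter are hypothetical. Real services define their operations, usually in their own namespace, in the WSDL.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
ET.register_namespace("soap", SOAP_NS)


def build_envelope(operation: str, parameters: dict) -> str:
    """Wrap a (hypothetical) operation call into Envelope and Body elements."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, operation)
    for name, value in parameters.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


envelope_xml = build_envelope("GetCourse", {"CourseId": "se-1"})
print(envelope_xml)
```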

#### gRPC

gRPC is a remote procedure call framework (hence the RPC). It uses Protocol Buffers (`.proto` files) to define the interfaces and also allows bidirectional streaming.

As described in https://grpc.io/docs/what-is-grpc/core-concepts/, gRPC lets you define four kinds of service methods:

Unary:
`rpc SayHello(HelloRequest) returns (HelloResponse);`

Server streaming:
`rpc LotsOfReplies(HelloRequest) returns (stream HelloResponse);`

Client streaming:
`rpc LotsOfGreetings(stream HelloRequest) returns (HelloResponse);`

Bidirectional streaming:
`rpc BidiHello(stream HelloRequest) returns (stream HelloResponse);`

`.proto` file example from https://learn.microsoft.com/en-us/aspnet/core/grpc :

```text
syntax = "proto3";

service Greeter {
  rpc SayHello (HelloRequest) returns (HelloReply);
}

message HelloRequest {
  string name = 1;
}

message HelloReply {
  string message = 1;
}
```

## Summary

A web service is a program that:

1. Works over the HTTP(S) protocol.
2. Exposes an interface that uses machine-readable content types and is intended to be used by other programs.
3. Provides some kind of a service.

### Further reading
- [On misconceptions of what REST is and is not](https://twobithistory.org/2020/06/28/rest.html).