# Web Technologies

## Service Oriented Architecture (SOA)

### What is SOA?

- Think of our daily lives. We book rides, order food, and check the weather via phone or the internet. These are services that we use to get certain things done

- In the same way, software can also use services to achieve certain goals. That is, you use a tool (deployed by someone else) to accomplish some task instead of doing it yourself. Or, you deploy a tool for others to use to accomplish something

- Why is service orientation important?
    - Web services
        - Think about developing an app for travellers. You want to tell them the best exchange rate, the lowest hotel prices, the lowest air fares
        - To do this, you can (i) write scrapers, databases, engineering pipelines to pull such data yourself, or (ii) just use something that is already available online
        - Obviously, using (ii) allows you to do this much more easily
        - All your app needs to do is combine these services together!
        ![image.png](attachment:image.png)

    - Large organisations
        - Different parts of an organisation often face similar problems.
        - When teams re-use other people's code, they are typically billed a certain amount to account for the cost of offering that support
        - You can email the team that manages the code each time you want something done, let them do it, and have them bill you manually OR the team exposes it as a service, you call it, and you are charged per request
        - Obviously, the latter is much easier

- Promotes modularity, extensibility, code reuse

- The main idea is to separate the responsibilities of the code into conceptual chunks, and expose each chunk to others as a service. This lets others combine these chunks easily, potentially in new domains
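
As a minimal sketch of the travel-app idea above, the code below composes two independent services into one result. The functions `hotel_price_service` and `air_fare_service` are hypothetical stand-ins with hard-coded data; in a real app each would be a remote call to a service deployed by someone else.

```python
from typing import Callable, Dict

# Hypothetical stand-ins for remote services (hard-coded data for illustration).
def hotel_price_service(city: str) -> int:
    return {"Tokyo": 120, "Paris": 150}.get(city, 100)

def air_fare_service(city: str) -> int:
    return {"Tokyo": 800, "Paris": 650}.get(city, 500)

# The app only knows each service by name and by how to call it.
SERVICES: Dict[str, Callable[[str], int]] = {
    "hotel_price": hotel_price_service,
    "air_fare": air_fare_service,
}

def build_trip_summary(city: str, services: Dict[str, Callable[[str], int]]) -> Dict[str, int]:
    """Combine the answers of every service into one response for the user."""
    return {name: service(city) for name, service in services.items()}

summary = build_trip_summary("Tokyo", SERVICES)
```

Adding a new capability (e.g. weather) only means registering one more entry in `SERVICES`; the composing code does not change.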



### What makes good SOA?

- Modular and loosely coupled
- Reusable and Combinable
    - Services meant to be mixed and matched
- Composable
- Platform and language independent
    - Done by following standard communication protocols 
        - XML file / HTTP request
- Self-describing
    - WSDL (Web Services Description Language)
- Self-advertising
    - clients must know what is available
    - UDDI (Universal Description, Discovery and Integration) to connect service providers with potential service requesters

### History of websystems

- Internet started from ARPANET

- Then small networks formed across research institutes in the US and began to connect to each other

- Next, the world wide web (WWW) was proposed and built on top of the internet

- Early on, HTML was the primary means of building sites (all sites were static and written in advance)
    - Request hits server --> server returns HTML --> browser renders it
    - The request and response are in HTTP, so they are understood by both browser and server

- Next came dynamic web pages. Instead of having the server send a static, unchanging page, you have web pages that are generated at the time of access
    - So nothing exists if there are no requests
    - Page is "put together" when queried, the html is sent to the browser, which pieces it together for the user
    - Dynamic sites are much more easily customisable

- Next came web apps. Instead of running on code stored locally, web apps run in your browser, and have their source code stored remotely
    - They can be run anywhere so long as the browser is compatible
    - No need to download and maintain software locally
    - e.g. Google Docs

- These modern apps rely on other services to display things, and they communicate using open standards like HTTP, XML, and JSON
    - e.g. if you construct a personal web page, you may want to pull your linkedin profile, your latest blog post, copy of your latest resume etc
    - So you compose all of these from microservices
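
To make the static vs dynamic distinction concrete, here is a small sketch of a dynamic page: the HTML does not exist until a request arrives, and it is assembled from that request's data. The function name `render_page` and the page layout are assumptions made for illustration.

```python
from datetime import datetime
from html import escape

def render_page(username: str, now: datetime) -> str:
    """Build the HTML for one request; nothing is stored on disk beforehand."""
    return (
        "<html><body>"
        # escape() stops user-supplied text from being interpreted as markup.
        f"<h1>Hello, {escape(username)}!</h1>"
        f"<p>Generated at {now:%Y-%m-%d %H:%M}</p>"
        "</body></html>"
    )

page = render_page("Ada", datetime(2024, 1, 2, 3, 4))
```

A static site would instead return the same pre-written file for every visitor.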

### Websystems Architecture

- We talked about layered architecture in an earlier segment, which implements separation of concerns. 

- Websystems are a prime example of this
    - ![image.png](attachment:image.png)
    - ![image.png](attachment:image-2.png)

### HTML/XML/JSON

- HTML: a markup language used to express and structure content
    - Think of this as skeleton of a site
    - For styling, use CSS
- XML: Store and transport data, usually used to send structured data
- JSON: lightweight data interchange format used by many webapps/services
    - Easy to convert to and from JavaScript objects, so it is more popular than XML in web dev
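
The sketch below encodes the same record in JSON and in XML using Python's standard library, then parses both back. The record and the `person` element name are made up for illustration.

```python
import json
import xml.etree.ElementTree as ET

record = {"name": "Ada", "city": "London"}

# JSON: the dictionary maps directly onto the format.
json_text = json.dumps(record)

# XML: the same data needs an explicit tree of elements.
root = ET.Element("person")
for key, value in record.items():
    ET.SubElement(root, key).text = value
xml_text = ET.tostring(root, encoding="unicode")

# Parsing each format back recovers the same data.
from_json = json.loads(json_text)
from_xml = {child.tag: child.text for child in ET.fromstring(xml_text)}
```

Here `json_text` is `{"name": "Ada", "city": "London"}` and `xml_text` is `<person><name>Ada</name><city>London</city></person>`, which hints at why JSON is often seen as the lighter format.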


### HTTP

- HyperText Transfer Protocol (HTTP)
    - Originally used to link HTML documents. Now it links both static and dynamic resources

- Relies on URLs (Uniform Resource Locators) and URIs (Uniform Resource Identifiers)
    - URLs are a subset of URIs
        - Both identify a resource, but a URL also says how to locate it: the protocol, the domain name (or IP address) of the machine the resource is stored on, and the location of the resource on that machine
        - Typically, we give a host name rather than an IP address in a URL, because the browser resolves the host name to the corresponding IP address
            - i.e. if the name was resolved recently, the answer is cached and the browser already knows where to direct you
            - if not, it queries a Domain Name System (DNS) server to find out
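
As a quick illustration of the parts of a URL, the standard library's `urllib.parse.urlsplit` breaks one into protocol, host, port, path, and query. The URL itself is a made-up example.

```python
from urllib.parse import urlsplit

url = "http://www.example.com:8080/docs/page.html?lang=en&page=2"
parts = urlsplit(url)

protocol = parts.scheme    # which protocol to speak
host = parts.hostname      # machine the resource lives on (resolved to an IP via DNS)
port = parts.port          # TCP port on that machine
path = parts.path          # location of the resource on that machine
query = parts.query        # optional query string after the "?"
```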

- HTTP sends messages over Transmission Control Protocol (TCP) connections, usually on port 80 (port 443 for HTTPS)

- Request format usually comprises (i) request line, (ii) headers, (iii) blank line, (iv) optional message body
    - Request line includes the request method, request URI, and protocol version
    - The URI may end with a query string, separated from the path with a `?`
    - The number and types of headers vary, because most headers are optional
        - Common headers
            - `Host`: Contains the domain name/IP address of the host. This is the only header that HTTP/1.1 requires on every request
            - `Accept`: Informs the server what content the client will accept as a response
            - `Content-Length`: Size of the body in bytes. Needed when a message body is sent (unless chunked transfer encoding is used)
            - `Content-Type`: Indicates the type of the body
    - Message body: If needed. Can be an HTML document, JSON, encoded params etc
- Response format comprises status line, headers, blank line, then optionally a message body
    - Status line: Protocol version, status code, and reason phrase (e.g. 200 OK, 4xx for client errors, 5xx for server errors)
    - `Content-Length`: Size of the body in bytes, when a message body exists
    - `Content-Type`: Indicates the type of the body
    - Message body: If needed. Can be an HTML document, JSON, encoded params etc
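
The message layout described above can be sketched by assembling a request and taking apart a response by hand. This is a simplified illustration, not a full HTTP implementation: `build_request` and `parse_response` are hypothetical helpers, and the default content type is an assumption.

```python
from typing import Dict, Tuple

CRLF = "\r\n"  # HTTP separates lines with carriage return + line feed

def build_request(method: str, path: str, host: str,
                  body: str = "", content_type: str = "application/json") -> str:
    """Assemble request line, headers, blank line, and optional body."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    if body:
        lines.append(f"Content-Type: {content_type}")
        # Content-Length counts bytes, not characters.
        lines.append(f"Content-Length: {len(body.encode('utf-8'))}")
    return CRLF.join(lines) + CRLF + CRLF + body

def parse_response(raw: str) -> Tuple[int, Dict[str, str], str]:
    """Split a response into status code, headers, and body."""
    head, _, body = raw.partition(CRLF + CRLF)  # the blank line ends the headers
    status_line, *header_lines = head.split(CRLF)
    _version, code, _reason = status_line.split(" ", 2)
    headers = dict(line.split(": ", 1) for line in header_lines)
    return int(code), headers, body

request = build_request("POST", "/items", "example.com", '{"a": 1}')
status, headers, body = parse_response(
    "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\nContent-Length: 5\r\n\r\nhello"
)
```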

- URIs and query strings are limited to a subset of ASCII characters
    - Anything else is percent-encoded (URL-encoded), e.g. a space becomes `%20`. Form data sent in a request body can be encoded the same way
    - Query strings use `=` to pair each parameter with its value, and `&` to chain multiple parameters
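
Python's `urllib.parse` module shows this encoding in action. The parameter names below are made up for illustration.

```python
from urllib.parse import parse_qs, quote, urlencode

# Characters outside the allowed set are replaced by % followed by hex byte values.
encoded_space = quote("hello world")
encoded_accent = quote("café")  # non-ASCII text is encoded as UTF-8 bytes first

# urlencode joins key=value pairs with "&".
# Note: it writes spaces as "+", the convention used for form data.
query = urlencode({"city": "New York", "max_price": 200})

# Decoding reverses the process; each value comes back as a list of strings.
decoded = parse_qs(query)
```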

- 3 usual methods
    - GET: Retrieve the resource requested in the URI
    - POST: Add or modify resources, usually used to submit data
    - PUT: Create or update the resource at the **specified URI**. This differs from POST, where the server determines the location of the new resource
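
The difference between POST and PUT can be sketched with a toy in-memory store. `ResourceStore` is a hypothetical class used only to illustrate the semantics; it is not how any particular server is implemented.

```python
from typing import Dict, Optional

class ResourceStore:
    """Toy server-side store illustrating GET, POST, and PUT semantics."""

    def __init__(self) -> None:
        self.resources: Dict[str, str] = {}
        self.next_id = 1

    def get(self, uri: str) -> Optional[str]:
        # GET: retrieve whatever lives at the requested URI.
        return self.resources.get(uri)

    def post(self, collection_uri: str, data: str) -> str:
        # POST: the server picks the URI of the new resource.
        uri = f"{collection_uri}/{self.next_id}"
        self.next_id += 1
        self.resources[uri] = data
        return uri

    def put(self, uri: str, data: str) -> str:
        # PUT: the client names the URI; create it, or overwrite what is there.
        self.resources[uri] = data
        return uri

store = ResourceStore()
first = store.post("/notes", "buy milk")
second = store.post("/notes", "buy milk")   # same data, yet a second resource
store.put("/notes/9", "draft")
store.put("/notes/9", "final")              # same URI, so it is overwritten
```

Repeating the same POST creates a new resource each time, while repeating the same PUT leaves the store in the same state.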

- HTTP is stateless
    - Relationship between requests is not preserved
        - i.e. if you make 2 consecutive requests, HTTP itself has no means of identifying that both came from you
        - Usually, state across requests is maintained with cookies: the server hands the client an identifier, and the client sends it back with each later request
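
A minimal sketch of how cookies add state on top of stateless requests, using the standard library's `http.cookies.SimpleCookie` to parse the cookie header. The `handle_request` function, the `session_id` cookie name, and the visit counter are assumptions made for illustration.

```python
import secrets
from http.cookies import SimpleCookie
from typing import Dict, Tuple

# Server-side state, keyed by the identifier stored in the client's cookie.
sessions: Dict[str, Dict[str, int]] = {}

def handle_request(cookie_header: str) -> Tuple[str, int]:
    """Return the cookie the client should keep, plus its visit count."""
    cookie = SimpleCookie(cookie_header)
    if "session_id" in cookie and cookie["session_id"].value in sessions:
        session_id = cookie["session_id"].value   # returning client
    else:
        session_id = secrets.token_hex(8)         # new client: issue an identifier
        sessions[session_id] = {"visits": 0}
    sessions[session_id]["visits"] += 1
    return f"session_id={session_id}", sessions[session_id]["visits"]

cookie_a, visits_first = handle_request("")          # no cookie yet
_, visits_second = handle_request(cookie_a)          # cookie sent back
cookie_b, visits_other = handle_request("")          # a different client
```

Without the cookie, the second request would look exactly like the first one, and the server could not link the two.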

    

### JavaScript

- Makes webpages interactive
   - Embedded between `<script>` and `</script>` tags in HTML documents

- Without JavaScript, any change to the page requires exchanging messages between client and server (a GET or POST request), with the server sending back an updated page
   - With JavaScript, you can dynamically change the UI on the client side, with processing done on the client rather than on the server


### Remote Procedure Call (RPC)

- With the growth of cloud computing, much workplace software communicates across large private networks
    - The client and server environments are often different. Some people may use Windows locally, while servers run Linux
    - This is because client and server environments serve different purposes. Client environments are optimised for the tasks of specific groups of users, while servers are optimised for sysadmins and other professionals

- Think about the modern network as a tiered construct (recall: n-Tier architecture)
    - Different layers specialise in different things; backend handles compute, front end handles specific user needs etc
    - This specialisation is what gives rise to differences in environments 

- As such, it can sometimes be tricky for clients and servers to communicate effectively
    - This is the basis for **middleware**, which facilitates such communication. Think of it as a `mediator` design pattern
    - **Remote Procedure Call (RPC)** is the basis for modern middleware

![image.png](attachment:image-2.png)

- There are many examples of middleware components, but RPC is the most common 
    - Allows clients to invoke procedures that are implemented on a server. The two can be completely separate machines, or different virtual instances on the same machine
        - In both cases, the caller and callee do not share an address space, either because they are on different machines or because the instances are segregated from each other

- RPC has 3 components
    - Client (caller)
        - Makes the call
    - Server (Callee)
        - Implements procedure that has been invoked
    - Interface Definition Language (IDL)
        - Language used to describe the interface between client and server
        - It tells the client what remote procedures are available, how they are accessed, and what the server will respond with

![image.png](attachment:image.png)

- Client stub
    - Establishes a connection with the server through **binding**
    - Formats (marshals) data into a standardised structure (e.g. XML)
    - Sends the remote procedure call
    - Receives the server stub's response

- Full RPC Process
    - ![image.png](attachment:image-3.png)
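
The stub mechanics can be sketched in a single process with Python's standard `xmlrpc.client` module, which formats calls and responses as XML. This is a simplified illustration: `client_stub`, `server_stub`, and `PROCEDURES` are hypothetical names, and the network transport and binding step are replaced by a direct function call.

```python
import xmlrpc.client

# --- Server side: the actual procedure and the server stub ---
def add(a: int, b: int) -> int:
    return a + b

PROCEDURES = {"add": add}  # procedures the server exposes

def server_stub(request_xml: str) -> str:
    # Unpack the XML request, run the named procedure, pack the result as XML.
    params, method_name = xmlrpc.client.loads(request_xml)
    result = PROCEDURES[method_name](*params)
    return xmlrpc.client.dumps((result,), methodresponse=True)

# --- Client side: the client stub hides all of the above from the caller ---
def client_stub(method_name: str, *args, transport=server_stub):
    request_xml = xmlrpc.client.dumps(args, methodname=method_name)
    response_xml = transport(request_xml)  # a real stub would send this over the network
    (result,), _ = xmlrpc.client.loads(response_xml)
    return result

total = client_stub("add", 2, 3)
request_preview = xmlrpc.client.dumps((2, 3), methodname="add")
```

From the caller's point of view, `client_stub("add", 2, 3)` looks like a local function call, even though the arguments travelled as XML and the work was done by the server-side code.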

### Object brokers

- Before object-oriented programming, procedural languages required each procedure signature to be unique, so a call identified exactly one implementation. With OO, inheritance and polymorphism mean the same method signature can have different implementations

- This creates issues for middleware, because middleware has to direct a caller to the right object, accounting for inheritance/polymorphism in distributed computing
    - Hence, this created the need for an `object broker`
    - Most famous standard is `CORBA`
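
The routing problem can be sketched with a toy broker that maps object identifiers to objects and lets each object's own class decide which implementation runs. `ObjectBroker` and the shape classes are hypothetical and only illustrate the idea; this is not CORBA's API.

```python
from typing import Any, Dict

class Shape:
    def area(self) -> int:
        raise NotImplementedError

class Square(Shape):
    def __init__(self, side: int) -> None:
        self.side = side

    def area(self) -> int:  # same signature as Rectangle.area, different body
        return self.side * self.side

class Rectangle(Shape):
    def __init__(self, width: int, height: int) -> None:
        self.width, self.height = width, height

    def area(self) -> int:
        return self.width * self.height

class ObjectBroker:
    """Toy broker: routes a call to the right object by identifier."""

    def __init__(self) -> None:
        self._objects: Dict[str, Any] = {}

    def register(self, object_id: str, obj: Any) -> None:
        self._objects[object_id] = obj

    def invoke(self, object_id: str, method_name: str, *args: Any) -> Any:
        target = self._objects[object_id]
        # Looking the method up on the object itself selects the right override.
        return getattr(target, method_name)(*args)

broker = ObjectBroker()
broker.register("sq", Square(2))
broker.register("rect", Rectangle(2, 3))
areas = [broker.invoke("sq", "area"), broker.invoke("rect", "area")]
```

The caller only supplies an identifier and a method name; the broker, not the caller, is responsible for finding the object whose implementation should run.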

![image.png](attachment:image.png)

![image.png](attachment:image-2.png)

