# What is a socket?

**Socket**: a way to speak to other programs using Unix file descriptors.

Programs communicate over a socket with the ***send()*** and ***recv()*** socket calls.

## Internet sockets

### Stream sockets (SOCK_STREAM)

Stream sockets are reliable two-way connected communication streams.

If you output two items into the socket in the order “1, 2”, they will arrive in the order “1, 2” at the opposite end.

Stream sockets use TCP (The Transmission Control Protocol). TCP makes sure your data arrives sequentially and error-free.
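
A minimal sketch of the ordering guarantee. It assumes `socket.socketpair()` is available and uses a local, already-connected stream pair as a stand-in for a real TCP connection; the names `left` and `right` are made up for illustration.

```python
import socket

# socketpair() returns two already-connected stream sockets.
left, right = socket.socketpair()

# Write two items in the order "1", "2".
left.sendall(b"1")
left.sendall(b"2")
left.close()  # closing signals end-of-stream to the reader

# Read until end-of-stream. A stream has no message boundaries,
# so the two writes may arrive in one chunk or in several.
received = b""
while True:
    chunk = right.recv(1024)
    if not chunk:
        break
    received += chunk
right.close()

print(received)
```

The bytes come out in the order they were written, but the reader cannot tell where one write ended and the next began.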

### Datagram sockets (SOCK_DGRAM)

Datagram sockets are sometimes called "connectionless" sockets.

Datagram sockets use UDP (User Datagram Protocol).

They are used when TCP is unavailable or when a few dropped packets are acceptable (applications that tolerate loss, such as games, audio, and video).

**Advantage**: Speed. UDP skips connection setup and delivery tracking, so it has less overhead than stream sockets.
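
A minimal sketch of connectionless sending, assuming both ends run on the loopback interface (where loss is unlikely in practice). The names `receiver` and `sender` are illustrative, and the timeout is only there so the example cannot hang if the datagram is dropped.

```python
import socket

# Receiver: bind a UDP socket to a free port on loopback.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)  # do not wait forever for a lost datagram
receiver_addr = receiver.getsockname()

# Sender: no connect() needed, each datagram carries its destination.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"hello", receiver_addr)

# recvfrom() returns the payload and the address it came from.
message, sender_addr = receiver.recvfrom(1024)
print(message, sender_addr)

sender.close()
receiver.close()
```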

### Low levels

Data encapsulation

Layered Network Model
* Application Layer (telnet, ftp, etc.)
* Host-to-Host Transport Layer (TCP, UDP)
* Internet Layer (IP and routing)
* Network Access Layer (Ethernet, Wi-Fi, etc.)

## IP Address, structs and Data Munging

### IP Address

Internet Protocol version 4 (IPv4) is the Internet's original addressing and routing scheme (e.g. 192.0.2.111). Addresses are 32 bits long, which gives $2^{32}$ possible addresses.

IPv6 (e.g. 2001:0db8:c9d2:0012:0000:0000:0000:0051) uses 128-bit addresses, which gives $2^{128}$ possible addresses.
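
To see the two address sizes concretely, here is a small sketch using Python's standard `ipaddress` module with the two example addresses from above. The variable names are illustrative.

```python
import ipaddress

v4 = ipaddress.ip_address("192.0.2.111")
v6 = ipaddress.ip_address("2001:0db8:c9d2:0012:0000:0000:0000:0051")

# .packed is the raw binary form: 4 bytes for IPv4, 16 bytes for IPv6.
v4_bits = len(v4.packed) * 8
v6_bits = len(v6.packed) * 8
print(v4_bits, v6_bits)

# IPv6 addresses are usually written with leading zeros dropped
# and one run of all-zero groups collapsed to "::".
print(v6.compressed)
```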




### Subnets

192.0.2.12/26

192.0.2.0: Network (192.0.2.12 bitwise AND with the netmask 255.255.255.192) \
12: Host (Host 12 on Network 192.0.2.0) \
26: Number of network bits in the address (the remaining 6 bits are the host part)
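
The split between network and host bits can be checked with the standard `ipaddress` module. This is a sketch for the example address above; `iface` and `host_bits` are illustrative names.

```python
import ipaddress

iface = ipaddress.ip_interface("192.0.2.12/26")

# A /26 netmask is 26 one-bits followed by 6 zero-bits.
netmask = iface.netmask

# The network is the address bitwise ANDed with the netmask.
network = iface.network

# The host part is whatever the netmask clears (address AND hostmask).
host_bits = int(iface.ip) & int(iface.network.hostmask)

print(netmask, network, host_bits)
```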

### Port Numbers

A port number is a 16-bit number that TCP (stream sockets) and UDP (datagram sockets) use, together with the IP address, to identify a particular endpoint on a host.
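
A short sketch of what "16-bit" means in practice, using the standard `struct` module. The port value is just an example; the format `"!H"` means an unsigned 16-bit integer in network (big-endian) byte order.

```python
import struct

PORT = 65432  # example port

# A port always fits in exactly two bytes.
packed = struct.pack("!H", PORT)
print(len(packed), packed.hex())

# 65535 is the largest 16-bit value; 65536 cannot be packed.
try:
    struct.pack("!H", 65536)
    fits = True
except struct.error:
    fits = False
print(fits)
```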

###  Byte Order?

### structs

struct **addrinfo**

In [11]:
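class SockAddr:
    # address family, AF_xxx
    sa_family: int
    # 14 bytes that contain the destination address and port number of the socket
    sa_data: bytes

class InAddr:
    s_addr: int  # that's a 32-bit int (4 bytes)

class SockAddrIn:
    sin_family: int   # address family, AF_INET
    sin_port: int     # port number, in network byte order
    sin_addr: InAddr  # IPv4 address
    sin_zero: bytes   # 8 bytes of padding, so the size matches SockAddr

class InAddr6:
    s6_addr: bytes  # IPv6 address, 128 bits (16 bytes)

class SockAddrIn6:
    sin6_family: int    # address family, AF_INET6
    sin6_port: int      # port number, in network byte order
    sin6_flowinfo: int  # IPv6 flow information
    sin6_addr: InAddr6  # IPv6 address
    sin6_scope_id: int  # scope ID

class SockAddrStorage:
    """
    Designed to be large enough to hold both IPv4 and IPv6
    """
    ss_family: int  # address family
    # the rest is implementation-specific padding
    __ss_pad1: bytes
    __ss_align: int
    __ss_pad2: bytes

class AddrInfo:
    ai_flags: int
    ai_family: int
    ai_socktype: int
    ai_protocol: int
    ai_addrlen: int  # size of ai_addr in bytes
    ai_addr: SockAddr
    ai_canonname: str
    ai_next: "AddrInfo"  # linked list: the next node (quoted, the class is still being defined)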

### Private (Or Disconnected) Networks

# 5. System Calls or Bust

## 5.1 getaddrinfo() - prepare to launch

It helps set up the _structs_ you need later on.

In [1]:
def getaddrinfo(
    node: str,  # e.g. 'www.example.com' or an IP address
    service: str,  # e.g. 'http' or a port number
    hints: "AddrInfo",  # a struct addrinfo (see the structs section)
):
    """
    Give this function three input parameters,
    and it returns res, a pointer to a linked list of results.
    Params:
        node: host name or IP address
        service: service name or port number
        hints: a struct addrinfo that you have already filled in with relevant information
    """
    pass
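
Python exposes the same call as `socket.getaddrinfo()`. The sketch below assumes a numeric loopback address and the `AI_NUMERICHOST` flag so that no DNS lookup is needed; instead of a linked list it returns a plain list of tuples.

```python
import socket

results = socket.getaddrinfo(
    "127.0.0.1",                  # node
    80,                           # service (a port number here)
    family=socket.AF_INET,        # the hints are passed as keyword arguments
    type=socket.SOCK_STREAM,
    flags=socket.AI_NUMERICHOST,  # treat the node as a numeric address
)

# Each entry mirrors the main fields of struct addrinfo.
for family, socktype, proto, canonname, sockaddr in results:
    print(family, socktype, proto, canonname, sockaddr)
```

The `sockaddr` element can be passed directly to `connect()` or `bind()` on a socket created with the matching `family`, `socktype`, and `proto`.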

## 5.2 socket() - Get the file descriptor

It simply returns a _socket descriptor_ that you can use in later system calls.

In [5]:
def socket(
    domain: int,  # PF_INET or PF_INET6
    type: int,  # SOCK_STREAM or SOCK_DGRAM
    protocol: int,  # set to 0 to choose the proper protocol for the given type (TCP, UDP)
):
    """
    Return:
        socket descriptor to use in later system calls.
    """
    pass
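
In Python the descriptor is wrapped in a socket object, and `fileno()` exposes the underlying integer. A minimal sketch:

```python
import socket

# AF_INET + SOCK_STREAM with protocol 0 lets the system pick TCP.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM, 0)

fd = sock.fileno()  # the underlying socket descriptor, a small integer
print(fd, sock.family, sock.type)

sock.close()
closed_fd = sock.fileno()  # fileno() reports -1 once the socket is closed
print(closed_fd)
```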

## 5.3 bind() - What port am I on?

Once you have a socket, you might have to associate that socket (_sock descriptor_) with a port on your local machine.

In [None]:
def bind(
    sockfd: int,
    my_addr,
    addrlen: int,
):
    """
    Params:
        sockfd: socket file descriptor returned by socket()
        my_addr: a pointer to a struct sockaddr that contains information about your address, namely, port and IP address.
        addrlen: the length in bytes of that address.
    """
    pass
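
A minimal Python sketch of binding. It assumes the loopback address and port 0, which asks the kernel to pick any free port; `getsockname()` then answers the question "what port am I on?".

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# In Python the address is a (host, port) tuple; no addrlen is needed.
sock.bind(("127.0.0.1", 0))

bound_host, bound_port = sock.getsockname()
print(bound_host, bound_port)

sock.close()
```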

## 5.4 connect() - Hey, you.

In [6]:
def connect(
    sockfd: int,
    serv_addr,
    addrlen: int,
):
    """
    Params:
        sockfd: our socket file descriptor
        serv_addr: a struct sockaddr describing the destination port and IP address
        addrlen: length in bytes of the server address
    """
    pass
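
A minimal Python sketch of `connect()`. So that it runs on its own, it first sets up a stand-in listening socket on loopback; the names `server` and `client` are illustrative. On loopback the kernel completes the TCP handshake for a listening socket, so `connect()` returns without the server having called `accept()` yet.

```python
import socket

# Stand-in server: listening on loopback, on a port chosen by the kernel.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
server_addr = server.getsockname()

# Client: connect() takes the (host, port) of the remote side.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(server_addr)

peer = client.getpeername()  # the address we are now connected to
print(peer)

client.close()
server.close()
```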

## 5.5 listen() - Will somebody please call me?

In [7]:
def listen(
    sockfd: int,
    backlog: int,
):
    """
    Params:
        sockfd: socket file descriptor from the socket() system call
        backlog: the number of connections allowed on the incoming queue
    """
    pass
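
A minimal Python sketch of `listen()` followed by `accept()`, assuming a loopback listener and a client in the same process. The backlog value 5 is an arbitrary example.

```python
import socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(5)  # up to 5 connections may wait in the incoming queue

# The client's connection waits in the queue until it is accepted.
client = socket.create_connection(listener.getsockname())

# accept() takes one pending connection and returns a NEW socket for it;
# the listener keeps listening for more connections.
conn, client_addr = listener.accept()
is_new_socket = conn.fileno() != listener.fileno()
addr_matches = client_addr == client.getsockname()

client.sendall(b"ping")
data = conn.recv(1024)
print(client_addr, data)

for s in (client, conn, listener):
    s.close()
```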

## 5.6 accept() - Thank you for calling

## 5.7 send() and recv()

## 5.8 sendto() and recvfrom()

## 5.9 close() and shutdown()

# Socket programming in Python

https://realpython.com/python-sockets/

In [None]:
import socket

HOST = '127.0.0.1'
PORT = 65432
UDP_PORT = 65430

def test_stream_socket():
    # needs a TCP server listening on (HOST, PORT)
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.connect((HOST, PORT))
        sock.sendall(bytes(1024))  # 1024 zero bytes
        return sock.recv(1024)

def test_dg_socket():
    # needs a UDP server on (HOST, UDP_PORT) that replies to the sender
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(bytes(1024), (HOST, UDP_PORT))
        message, addr = sock.recvfrom(1024)
        return message, addr
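
A stream client needs something to talk to. The sketch below is one way to try it out in a single process: a hypothetical `echo_once` helper runs in a background thread and echoes one message back. It uses a kernel-chosen port instead of a fixed one to avoid clashes; none of this comes from the linked tutorial.

```python
import socket
import threading

def echo_once(listener):
    """Accept a single connection and echo back what it sends."""
    conn, _addr = listener.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data)

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
listener.listen(1)
host, port = listener.getsockname()

server_thread = threading.Thread(target=echo_once, args=(listener,))
server_thread.start()

# The client side: connect, send, and read the echoed reply.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
    sock.connect((host, port))
    sock.sendall(b"hello")
    reply = sock.recv(1024)

server_thread.join()
listener.close()
print(reply)
```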

# Slightly advanced

## Blocking

_accept()_ and _recv()_ are functions that block. They can do this because they're allowed to: when you first create the socket descriptor with _socket()_, the kernel sets it to blocking.

If you try to read from a non-blocking socket and there's no data there, it's not allowed to block, so the call fails right away: in C it returns -1 with _errno_ set to EAGAIN or EWOULDBLOCK, and in Python it raises an exception (BlockingIOError).

If you put your program in a busy-wait looking for data on the socket, you'll suck up CPU time.
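
A minimal sketch of the non-blocking case in Python, assuming `socket.socketpair()` is available. Nothing has been sent yet, so the read has no data to return; the flag `would_block` is an illustrative name.

```python
import socket

a, b = socket.socketpair()
b.setblocking(False)  # switch this end to non-blocking mode

try:
    b.recv(1024)        # no data has been sent yet
    would_block = False
except BlockingIOError:
    # Python's way of reporting EAGAIN / EWOULDBLOCK
    would_block = True

print(would_block)

a.close()
b.close()
```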

## _poll()_ synchronous I/O multiplexing

What you really want to be able to do is somehow monitor a _bunch_ of sockets at once and then handle the ones that have data ready. This way you don't have to continuously poll all those sockets to see which are ready to read.

In a nutshell, we're going to ask the OS to do all the dirty work for us, and just let us know when some data is ready to read on which sockets. In the meantime, our process can go to sleep, saving system resources.

_poll()_ is horribly slow when it comes to giant numbers of connections.
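
A minimal sketch using Python's `select.poll()`, which wraps the same system call. It assumes a Unix-like system (`select.poll` is not available on Windows) and monitors one end of a socket pair; with more sockets you would call `register()` once per socket.

```python
import select
import socket

a, b = socket.socketpair()
b_fd = b.fileno()

poller = select.poll()
poller.register(b, select.POLLIN)  # tell the OS we care about "ready to read"

before = poller.poll(0)    # timeout in milliseconds; 0 means do not wait
a.sendall(b"x")
after = poller.poll(1000)  # sleep for up to one second until an event arrives

# poll() returns (file descriptor, event mask) pairs for the ready sockets.
ready_fds = [fd for fd, event in after if event & select.POLLIN]
print(before, ready_fds)

a.close()
b.close()
```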

## _select()_ synchronous I/O multiplexing - old school

Problem: you are a server and you want to listen for incoming connections as well as keep reading from the connections you already have.

_select()_ gives you the power to monitor several sockets at the same time. It'll tell you which ones are ready for reading, which are ready for writing, and which sockets have raised exceptions, if you really want to know that.

_Warning:_ though very portable, _select()_ is terribly slow when it comes to giant numbers of connections.
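
A minimal sketch of the server problem above using Python's `select.select()`: one call watches the listener for new connections and an existing connection for data. It assumes a loopback listener and a client in the same process; the one-second timeouts are arbitrary.

```python
import select
import socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(5)

client = socket.create_connection(listener.getsockname())

# A listening socket counts as "readable" when a connection is waiting.
readable, _, _ = select.select([listener], [], [], 1.0)
listener_ready = listener in readable
conn, _addr = listener.accept()

# Now watch both: the listener for new connections, conn for incoming data.
client.sendall(b"hi")
readable, _, _ = select.select([listener, conn], [], [], 1.0)
conn_ready = conn in readable
data = conn.recv(1024) if conn_ready else b""
print(listener_ready, conn_ready, data)

for s in (client, conn, listener):
    s.close()
```

A real server would run the `select()` call in a loop, calling `accept()` when the listener is readable and `recv()` when a connection is.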






## Tool tips
- List all open sockets: netstat
- View routing table: route, netstat -r

## SSL explanation

Server side
- server reads data from file
- server encrypts/compresses data
- server send() encrypted data

Client
- client recv() encrypted data
- client decrypts/decompresses data
- client writes data to file
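
The shape of this pipeline can be sketched with the standard library. This is only an illustration of "transform, send, receive, undo": it uses `zlib` compression as a stand-in for the encrypt/compress step and performs no encryption, a socket pair stands in for the network, and an in-memory byte string stands in for the file.

```python
import socket
import zlib

server_side, client_side = socket.socketpair()

original = b"data read from a file " * 20

# Server: transform the data, then send() it.
server_side.sendall(zlib.compress(original))
server_side.close()  # end-of-stream tells the client everything has arrived

# Client: recv() everything, then undo the transformation.
chunks = []
while True:
    chunk = client_side.recv(4096)
    if not chunk:
        break
    chunks.append(chunk)
client_side.close()

restored = zlib.decompress(b"".join(chunks))
print(restored == original)
```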

## Write a server that accepts shell commands from a client and executes them
Client
- connect() to server
- send("/sbin/ls > /tmp/client.out")
- close() the connection

Meanwhile Server
- accept() the connection from client
- recv(str) the command string
- close() the connection
- system(str) to run the command