Skip to content
Kyle edited this page May 1, 2018 · 37 revisions

Blockchain Simulator

Pydocs

Python documentation created using pydoc can be found at: https://kyles22.github.io/BlockchainSimulator/index.html

Design

Peer to Peer Communication

Nodes within the block chain network communicate using TCP and UDP on port 10000. To allow differentiating messages that are used on the same port, a combination of envelopes and framing are used. All messages transmitted on port 10000 are wrapped in an envelope.

Envelope

  • Request Type. The request type to differentiate the different types of messages.
  • Request Message. The binary data that can be decoded separately once the message has been routed to a function that can handle the corresponding request type.

Request Types

  1. Blob. The request type when binary data received from a client outside the network is flooded to the peers in the network.
  2. Mined Block. The request type when a node mines and block and floods the block to the peers in the network.
  3. Discovery. The request type that is used for a discovery message when a node broadcasts that it is alive to other nodes in the network.
  4. Resolution. The request type to request all block headers for a peer's chain when a block finds a higher cost chain in the network.
  5. Block Resolution. The request type to request the block at a specific index in the chain from another node including its body data for completing a chain once the resolution chain headers have been verified.

Framing

Because UDP is datagram-based, there is no need for framing because all messages sent using UDP are designed to fit in a single UDP segment. However, for messages sent over TCP on port 10000, length framing is used to differentiate between different messages in the stream. All messages in the stream are prefixed with a 4 byte length. This allows the length to be received first, followed by the rest of the message to be received based on the known length.

Nodes and Discovery

The block chain simulator consists of nodes that represent an individual processing unit across the network. Each node has a unique ID and heartbeat thread that broadcasts a heartbeat via UDP on port 10000 every few seconds. This heartbeat signal is used to discover new nodes and add them to the list of nodes called the node pool, and to determine which nodes are still in the network. Each node is responsible for maintaining its own node pool.

All nodes in the network have discovery endpoints listening for heartbeats. A soft state timer is used to remove nodes from the node pool that have not sent out a heartbeat recently. When a heartbeat signal is detected, if that ID is not in the node pool it will be added to it, or if the ID is a known node it resets the timer.

The miner, chain, and blocks

Each node continuously mines, or generates new blocks of data. A block consists of many components.

  • Difficulty. A difficulty or cost that is how long the block should take to mine based on its nonce. Blocks are mined at a constant rate by adjusting the difficulty to calculate the hash of the next block.
  • Timestamp. A timestamp of when it was mined.
  • Entropy. A random entropy number to avoid valid duplicates being generated.
  • Previous Hash. The hash of the previous block which allows the blocks to be formed in a chain.
  • Data Hash. The hash is generated from the data and previous hash.
  • Data. If data is not given to a node from outside the network then the data will be arbitrary.

Once a new block is mined, it is added to a list of blocks, called the block chain. The node that mined the new block will need to transmit the block to the other nodes in the network. While each node maintains its own block chain, every node should have the same block chain because of the communication of blocks.

Data Input

All nodes listen on port 9999 for a TCP connection for a client connection from outside of the network. This connection allows clients to send data to be stored in the block chain. Data is delimited with a new line character and has a maximum size of 65535 bytes.

After data is received from a client, it is forwarded on port 10000 using UDP to all discovered nodes. A timestamp is added to the data to prevent this message from being passed along infinitely. When a message is received the node checks to see if the data-timestamp pair is unique to the node. If the message is unique it is forwarded, otherwise it is discarded.

Data Output Requests

All nodes also listen on port 9998 for a TCP connection from a client outside of the network. This allows for clients to request for data from the block chain. Clients provide an index for which block they would like to receive the data and timestamp from.

Mined Block Messages

When a node successfully mines a block, it sends the block using UDP on port 10000 to all discovered nodes. The block data is wrapped in an envelope, which will allow it to be distinguished from client input data which is forwarded on the same port. The total cost of the block chain is also forwarded in this block message.

When a node receives a block it attempts to add it to the block chain. If the block can be added, it is forwarded to all discovered nodes, otherwise the block is discarded. The node checks the cost that was sent with the block to determine if it was from a chain with a higher cost than its own chain. If the cost is higher, this node is missing blocks or is on a branched off chain and must perform a chain resolution.

Chain Resolution

When a node receives a block message, it checks what the cost of the chain received. If the received chain cost is equal to or greater than the nodes current chain cost, the node enters a chain resolution protocol.

The chain resolution begins by creating a new chain with only the very first block and the latest received block added. The node then establishes a TCP connection on port 10000 with the node that sent the block. The node sends a request to return just the block headers for all blocks in that node's chain. The node inserts these headers into the new chain and verifies that the cost of this chain is correct and that it is a valid block chain. If the chain is not valid or costs less the message is discarded and the newly created chain is discarded as well.

If the received block headers confirm that total cost is greater than or equal to the node's original chain, the node will make a request for the missing block body data. The other node returns the body data of its chain, which is added to the newly created chain. Once the new chain has been filled, the node checks a final time to see if the cost is still greater than the original chain. If it is, the original chain is discarded and replaced with the new one.

Environment

Language

The project was implemented in Python 3.7. Python was used for a number of reasons. The language was familiar among group members which would help the project progress quickly. Python has numerous libraries available, such as the library used for hashing, which would help save development time. Lastly, this language offered cross-platform support for the group members.

Docker

Docker is a tool that operates on the idea of containers or running multiple applications on a single host. This allows the project to be run on a single machine with the ability to scale the number of nodes in the block chain network without needing multiple physical machines. Docker also allows the project to be run on different operating systems without having to worry about consistency with python versions and packages.

Testing Results

Results

The block chain simulation was tested on a virtual network to simulate how a larger network might operate. In ideal conditions the virtual network operates no different that a physical work but allows greater control of the nodes. This allowed for more effective testing by being able to scale the number of nodes easily. The following aspects of the project were tested.

  • Mining Blocks. A single node was able to mine new blocks and add them to the block chain.
  • Discovery of Nodes. When a new node was created existing nodes were able to discover it and add it to the node pool.
  • Sending Blocks. When a new block is mined it is able to be sent and received to known nodes.
  • Data Input. An external client was able to connect to a node and input data into the block chain.
  • Input Data Propagation. When data is input into the network it is able to be flooded to all known nodes in the network to be added to the block chain.
  • Data Output. An external client was able to connect to a node and request block data. The request was able to be processed and the data returned.
  • Leaving the Network. When a node leaves the network the remaining nodes are able to detect and remove them from their node pool.
  • Resolution Detection. A higher cost chain is able to be detected and the node is able to enter the chain resolution state.
  • Chain Forwarding. A node in the chain resolution state is able to receive a block chain from another node.

Known Bugs

In our testing we uncovered the following bugs.

  • Currently data input only handles ASCII data, so characters outside of this may result in a decode error when attempting to output the data for a particular block. This causes the block data to not be output.

Project Evaluation

Evaluation

Overall, the project was a success and we managed to achieve all goals we set out for the project on time. A block chain is able to created and verified while also taking input data from outside the network. The block chain network is able to be scaled by adding and removing nodes to and from the network without disrupting its operation. The block chain resilient to fabricated blocks of data because of its chain resolution handling.

Learning Opportunities and Obstacles

Throughout the development of the project our group ran into many challenges which lead to many learning opportunities for block chain technology. Our group learned about how nodes can achieve consensus without having a central authority. We learned about network discovery using broadcasting, and how this discovery can lead to further communication. Our group learned about handling communication in a larger peer to peer network.

The largest obstacle in our project was developing a chain resolution protocol. This proved to be very difficult since nodes can't trust data from any other node. Each node needs to be able to verify chains and handle both correct and incorrect data. Being able to resolve discrepancies is extremely important in a block chain because it relies on not having a central authority.

Future Recommendations

The current operation to discover new nodes uses a UDP broadcast. This prevents our block chain implementation from ever being supported on the internet since broadcast isn't supported for outside a LAN network. Looking at existing block chain implementations, such as BitCoin, and their discovery methods would be a valuable resource on expanding in this area.

Group Member Contributions

Allan Kerr - amk564:

Design and Implementation

  • Block, Chain, and Miner. Fully designed and implemented the block and chain structures to maintain and verify the chain of blocks and their hashes along with the miner that calculates a nonce to try and find a hash with enough 0's, recalculates the difficulty for the next block, and adds new blocks to the chain.
  • Block Flooding. Designed and implemented the protocol for the flooding of blocks using UDP between peers in the network when a node successfully mines a block.
  • Data Input and Flooding. Designed and implemented the TCP service for receiving client ASCII input on port 9999, attaching a timestamp to it, and flooding it to peers in the network on port 10000.
  • Data Output. Designed and implemented the TCP service for receiving client output requests where a block index can be requested on port 9998 and the contents of the block will be transmitted to the client.
  • Request Envelopes and Base UDP and TCP servers. Designed and implemented the base UDP and TCP services that handle requests and ports 9998, 9999, and 10000 that handle both peer to peer communication and communication with clients that are not part of the blockchain. This also includes the envelope design and routing for differentiating peer to peer messages on port 10000.
  • Chain Resolution. Designed and implemented the chain resolution protocol where a peer detects a chain with a greater or equal cost in the network, requests the resolution chain consisting of only block headers, validates the resolution chain, makes the corresponding block resolution requests to fill in any missing block body data once the chain has been validated, and replaces the node's current chain with the newly completed chain if it still has a higher cost.

Environment

  • Docker. Created the necessary Docker configuration to allow multiple nodes to run on a virtual network on a single machine.

Documentation

  • Quick Start Guide. Wrote the quick start guide for building, running, and interacting with the blockchain.

Dominik Pytlak - dmp452

Documentation

  • Design. Wrote the documentation for the design portion and created the images showing different types of communication.
  • Environment. Wrote the documentation for the environment that the design was implemented in.
  • Testing. Wrote the documentation for the testing that was done for the project.
  • Evaluation. Wrote the documentation for the evaluation on the project that was discussed as a group.

Kyle Seidenthal - kts135

Design and Implementation

  • Heartbeat and Discovery. Designed and implemented the system for node discovery using UDP broadcasting and a soft-state node pool to clean up any nodes that are no longer part of the network.
  • TCP Framing. Designed and implemented the length framing system used for recognizing TCP messages sent between nodes on port 10000.
  • Research and Design. Researched current blockchain technologies and protocols and designed blockchain structure.
  • Pydoc. Created the documentation html at https://kyles22.github.io/BlockchainSimulator/index.html
  • Acceptance Tests. Tested to verify that node discovery and chain resolution were correct