mtanzim/gopl-book

gopl-book

Book

Why?

  • Refresher on Golang basics
  • Improve understanding of Go concurrency mechanisms

Related efforts

Notes on critical concepts

Interfaces

  • TODO

Goroutines and Channels

  • Go allows for two styles of concurrent programming
    1. Communicating sequential processes (CSP): passing values between independent activities (goroutines)
    2. Shared memory multithreading (threads in most mainstream languages)

Goroutines

  • Each concurrently executing activity is called a goroutine
  • For example, imagine 2 independent functions
  • In a sequential program, the functions are called one after the other
  • In a concurrent program, both functions are active at once
  • A goroutine is similar to an OS thread. Assuming so is safe for writing correct programs; the differences are described later
  • Upon a program start, the only goroutine is the main function; this is known as the main goroutine
  • New goroutines can be created with the go statement, ie:
f() // call f, await return
go f() // create a goroutine to call f(), DO NOT wait
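As a minimal sketch of the go statement, the program below runs two independent functions concurrently. The functions square and double and the helper runBoth are hypothetical names chosen for illustration, and sync.WaitGroup (covered later in these notes) is used only so that the caller can wait for both goroutines.

```go
package main

import (
	"fmt"
	"sync"
)

// square and double are hypothetical stand-ins for two independent tasks.
func square(n int) int { return n * n }
func double(n int) int { return 2 * n }

// runBoth runs both tasks concurrently and returns their results.
func runBoth(n int) (sq, db int) {
	var wg sync.WaitGroup
	wg.Add(2)
	go func() { // first goroutine
		defer wg.Done()
		sq = square(n)
	}()
	go func() { // second goroutine
		defer wg.Done()
		db = double(n)
	}()
	wg.Wait() // the go statements above do not wait, so we wait here
	return sq, db
}

func main() {
	sq, db := runBoth(7)
	fmt.Println(sq, db) // prints "49 14"
}
```

Each goroutine writes to its own variable, and the results are only read after wg.Wait returns, so there is no data race.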

Channels

  • Goroutines are the activities of concurrent programs, and channels are the connections between them
  • A channel is a communication mechanism through which one goroutine can send values to another goroutine
  • Each channel carries values of a particular type, called the channel's element type
  • The built in make function can create channels
ch := make(chan int)
  • Channels are reference types, similar to maps and slices. Copying a channel or passing it as an argument therefore copies a reference to the same underlying data structure
  • The zero value of channels is nil
  • Channels allow two operations, send and receive, collectively known as communications; both use the <- operator
ch <- x // a send statement
x = <- ch // a receive statement
<- ch // receive, discard result
  • Channels also support a close operation
  • Closed channels indicate that no more values will be sent; subsequent attempts at send will panic
  • Closed channels can be received from until drained, and all values after will be the zero value of the channel element type
  • Channels can be buffered or unbuffered; buffered channels have non-zero capacity, while unbuffered channels have a capacity of zero. Details are explained in the following sections
ch = make(chan int) // unbuffered channel
ch = make(chan int, 0) // unbuffered channel
ch = make(chan int, 3) // buffered channel with capacity of 3
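The send, receive, and close operations described above can be sketched as follows. The helper drain is a hypothetical name introduced for this example; it uses the two-result form of receive to detect that the channel has been closed and drained.

```go
package main

import "fmt"

// drain receives from ch until it is closed and drained,
// returning every value it received.
func drain(ch chan int) []int {
	var got []int
	for {
		x, ok := <-ch // ok is false once ch is closed and empty
		if !ok {
			return got
		}
		got = append(got, x)
	}
}

func main() {
	ch := make(chan int, 3) // buffered, so the sends below do not block
	ch <- 1
	ch <- 2
	close(ch) // no more values will be sent

	fmt.Println(drain(ch)) // prints "[1 2]"

	x, ok := <-ch      // receiving after the channel is drained
	fmt.Println(x, ok) // prints "0 false"

	var nilCh chan int
	fmt.Println(nilCh == nil) // prints "true": the zero value of a channel is nil
}
```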
Unbuffered Channels
  • Unbuffered channels are a synchronization mechanism, and are often called synchronous channels
  • A send operation on an unbuffered channel blocks the sending goroutine until the message has been received by another goroutine
  • Similarly, if a receive is attempted first, the receiving goroutine blocks until a message is sent on the channel by another goroutine
  • Note spinner2 for an example of a simple unbuffered channel
  • Channels sometimes carry no meaningful value and are used purely for synchronization; in these cases, we call the messages events
  • Events are conventionally given the element type struct{}, although bool and int are also common
  • Note spinner3 for an example of a simple unbuffered channel with events
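A minimal sketch of synchronizing on an event, assuming a hypothetical helper runWorker (this is not the spinner3 code): the caller blocks on the unbuffered channel until the background goroutine signals that it is done.

```go
package main

import "fmt"

// runWorker starts a goroutine and waits for its completion event.
func runWorker() string {
	var result string
	done := make(chan struct{}) // unbuffered; carries events, not data
	go func() {
		result = "work finished"
		done <- struct{}{} // blocks until the caller receives
	}()
	<-done // blocks until the goroutine sends
	return result
}

func main() {
	fmt.Println(runWorker()) // prints "work finished"
}
```

Because the send completes only when the receive does, the write to result is guaranteed to be visible to the caller after the receive.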
Pipelines
  • Channels can be used to create a pipeline between asynchronous processes
  • See the following pipeline example, also demonstrated below in the diagram

Pipeline

  • A channel can be closed when it is important for the sender to communicate that no more values will be produced
  • Sending on a closed channel causes a panic
  • After the closed channel is drained, subsequent attempts at receiving will yield a zero value of the element type
  • The receiving operation can check whether a channel is closed and drained via a second result, e.g.:
for {
  x, ok := <-naturals
  if !ok {
    break // naturals was closed and drained
  }
  squares <- x * x
}
  • A more convenient syntax is as follows:
for x := range naturals {
  squares <- x * x
}
close(squares)
  • See the complete example in pipeline2
  • Note that the garbage collector reclaims unreachable channels whether or not they are closed; closing a channel is therefore only necessary when receiving goroutines must be told that no more values are coming, so that they stop waiting
  • Additionally, channels can be marked as unidirectional, ie:
func counter(out chan<- int) {
  for x := 0; x < 100; x++ {
    out <- x
  }
  close(out)
}
  • Note the full example in pipeline3, particularly the automatic type conversion from bidirectional to unidirectional channels
  • The reverse conversion is not permitted or valid
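The pieces of this section can be put together in a small three-stage pipeline. This is a sketch in the spirit of the pipeline examples rather than a copy of them; the helper names collect and squares and the parameter n are assumptions made for illustration.

```go
package main

import "fmt"

// counter sends 0..n-1 on out, then closes it.
func counter(out chan<- int, n int) {
	for x := 0; x < n; x++ {
		out <- x
	}
	close(out)
}

// squarer squares each value from in and sends it on out.
func squarer(out chan<- int, in <-chan int) {
	for x := range in { // the loop ends when in is closed and drained
		out <- x * x
	}
	close(out)
}

// collect gathers every value from in into a slice.
func collect(in <-chan int) []int {
	var result []int
	for x := range in {
		result = append(result, x)
	}
	return result
}

// squares wires the three stages together.
func squares(n int) []int {
	naturals := make(chan int)
	squared := make(chan int)
	go counter(naturals, n)       // chan int converts implicitly to chan<- int
	go squarer(squared, naturals) // and to <-chan int
	return collect(squared)
}

func main() {
	fmt.Println(squares(5)) // prints "[0 1 4 9 16]"
}
```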
Buffered Channels
  • Buffered channels hold a queue of values (FIFO), sending adds a value to the back, receiving removes a value from the front
  • See the following diagram for the blocking mechanism on send/receive

Buffered channels

  • A buffered channel can be instantiated with
ch = make(chan string, 3)
  • cap returns the capacity of the channel's buffer, and len returns the number of values currently buffered
  • Buffered channels are meant for communication between different goroutines; do not use them as a simple queue within a single goroutine, as there is a risk of deadlock
  • The following is an example of buffered channel usage
func query() string {
  responses := make(chan string, 3)
  go func() { responses <- request("https://httpbin.org/image/jpeg") }()
  go func() { responses <- request("https://httpbin.org/image/svg") }()
  go func() { responses <- request("https://httpbin.org/image/png") }()
  return <-responses // return the quickest response
}
  • Note that with an unbuffered channel, the two slower goroutines would have no one to receive their messages and would be stuck forever; this is known as a goroutine leak
  • The full example can be found here
  • Note that the choice of buffered vs unbuffered, as well as the capacity of the buffered channel affects both performance and correctness of the program; watch for goroutine leaks that may add up over time to cause hangs and slowdowns
  • Unlike unreachable variables, leaked goroutines are not reclaimed by the garbage collector
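To make the query pattern runnable without network access, the sketch below replaces the HTTP request with a simulated one. The mirror type, the demoMirrors values, and this version of request are all assumptions for illustration; only the buffered-channel pattern comes from the notes above.

```go
package main

import (
	"fmt"
	"time"
)

// mirror is a hypothetical server with a simulated response time.
type mirror struct {
	name  string
	delay time.Duration
}

// demoMirrors is made-up data for the example.
var demoMirrors = []mirror{
	{"slow", 300 * time.Millisecond},
	{"fast", 5 * time.Millisecond},
	{"slower", 400 * time.Millisecond},
}

// request simulates a network call by sleeping for the mirror's delay.
func request(m mirror) string {
	time.Sleep(m.delay)
	return m.name
}

// firstResponse queries every mirror concurrently and returns the
// quickest answer. mirrors must be non-empty. The buffer has room for
// every response, so the slower goroutines can still send and exit
// instead of leaking.
func firstResponse(mirrors []mirror) string {
	responses := make(chan string, len(mirrors))
	for _, m := range mirrors {
		go func(m mirror) { responses <- request(m) }(m)
	}
	return <-responses
}

func main() {
	fmt.Println(firstResponse(demoMirrors)) // the fastest mirror wins
}
```

With the delays chosen here the "fast" mirror should answer first, although the result ultimately depends on scheduling and timing.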
Looping in parallel
  • A very common concurrency pattern is to run iterations of a loop in parallel
  • Examples of this can be seen in the thumbnail generator example
  • Note particularly the makeThumbnails6 function, for which we perform the following steps:
    1. Make channels for input and output
    2. Fire off a function concurrently, passing the communication channels
    3. Produce on the input channel from the main function
    4. Listen on the input channel inside makeThumbnails6, perform all required tasks, and signal closure on the output channel with the help of sync.WaitGroup. This is a special counter that safely increments and decrements as goroutines start up and close off.
    5. Iterate over the output channel inside the main function and summarize the results, which is summing the number of bytes written in this case
  • Also note the concurrent web crawler example. Highlighting the key concepts used:
    • Crawler goroutines are fed through the unseenLinks channel
    • The seen map is confined to the main goroutine; this ensures correctness and prevents unnecessary information sharing
    • Links found by crawl are sent to the worklist from yet another goroutine to avoid deadlocks
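The parallel-loop pattern with sync.WaitGroup can be sketched as below. This is not the makeThumbnails6 code; fakeThumbnail and totalBytes are hypothetical names, and the "bytes written" are simulated by the length of the file name.

```go
package main

import (
	"fmt"
	"sync"
)

// fakeThumbnail stands in for real image work and returns a
// made-up byte count (the length of the file name).
func fakeThumbnail(file string) int {
	return len(file)
}

// totalBytes processes every file in its own goroutine and sums the results.
func totalBytes(files []string) int {
	sizes := make(chan int)
	var wg sync.WaitGroup // counts the goroutines still working
	for _, f := range files {
		wg.Add(1) // increment before starting the goroutine
		go func(f string) {
			defer wg.Done() // decrement when this worker finishes
			sizes <- fakeThumbnail(f)
		}(f)
	}

	// Closer goroutine: once every worker is done, close sizes so
	// that the range loop below terminates.
	go func() {
		wg.Wait()
		close(sizes)
	}()

	total := 0
	for size := range sizes {
		total += size
	}
	return total
}

func main() {
	fmt.Println(totalBytes([]string{"a.jpg", "bb.jpg", "ccc.jpg"})) // prints "18"
}
```

The wait and close must happen in a separate goroutine: if the main goroutine called wg.Wait before ranging over the unbuffered sizes channel, the workers would block on their sends and the program would deadlock.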
Multiplexing with select
  • We can listen from several channels and multiplex based on their responses using a select statement
  • A select statement is like a switch statement: it has a number of cases and an optional default
  • Each case can specify a communication and its associated actions with a block of statements
select {
case <-ch1:
  // ...
case x := <-ch2:
  // do something with x
case ch3 <- y:
  // ...
default:
  // ...
}
  • select waits until communication on some channel is ready
  • An empty select, ie: select{} waits forever
  • If multiple cases are ready, select picks at random
  • To avoid blocking, add a default case; it runs when no other communication can proceed immediately
  • See the countdown for example usage of select
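Two common uses of select can be sketched as follows: waiting on whichever of two channels is ready first, and polling a channel without blocking. The function names launch and tryReceive and the two timeout constants are assumptions chosen for this example, loosely modelled on the countdown idea.

```go
package main

import (
	"fmt"
	"time"
)

// Hypothetical timeouts used by the example.
const (
	shortTimeout = 20 * time.Millisecond
	longTimeout  = 2 * time.Second
)

// launch waits for either the timeout or an abort event,
// whichever is ready first.
func launch(abort <-chan struct{}, timeout time.Duration) string {
	select {
	case <-time.After(timeout):
		return "launched"
	case <-abort:
		return "aborted"
	}
}

// tryReceive polls ch without blocking, thanks to the default case.
func tryReceive(ch <-chan int) (int, bool) {
	select {
	case x := <-ch:
		return x, true
	default:
		return 0, false
	}
}

func main() {
	abort := make(chan struct{}, 1)
	fmt.Println(launch(abort, shortTimeout)) // prints "launched"
	abort <- struct{}{}
	fmt.Println(launch(abort, longTimeout)) // prints "aborted"

	ch := make(chan int, 1)
	fmt.Println(tryReceive(ch)) // prints "0 false"
	ch <- 42
	fmt.Println(tryReceive(ch)) // prints "42 true"
}
```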
Cancellations
  • There is no way for one goroutine to terminate another directly, since doing so would leave shared variables in an undefined state
  • In the countdown, we sent a single value over a channel, but note that a message over a channel can only be consumed once. Thus, how do we signal multiple goroutines to stop?
  • One solution may be to send multiple cancel messages, but it is difficult to know how many goroutines may be operating at a time
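  • Note that once a channel is closed and drained, all subsequent receive operations proceed immediately and yield a zero value
  • We can exploit this behaviour for a broadcast: closing a channel that is never sent on is observed by every goroutine receiving from it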
  • See broadcast for an example
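A minimal sketch of the broadcast idea, assuming hypothetical helpers cancelled and stopAll (this is not the code from the broadcast example): a single close of the done channel is observed by every waiting goroutine, regardless of how many there are.

```go
package main

import (
	"fmt"
	"sync"
)

// cancelled polls done without blocking; it reports true once done is closed.
func cancelled(done <-chan struct{}) bool {
	select {
	case <-done:
		return true
	default:
		return false
	}
}

// stopAll starts n workers that wait for cancellation, broadcasts the
// cancellation with a single close, and returns how many workers stopped.
func stopAll(n int) int {
	done := make(chan struct{})
	stopped := make(chan int, n) // room for every worker to report
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			<-done // unblocks for every worker once done is closed
			stopped <- id
		}(i)
	}
	close(done) // one close, observed by all n workers
	wg.Wait()
	return len(stopped)
}

func main() {
	done := make(chan struct{})
	fmt.Println(cancelled(done)) // prints "false"
	close(done)
	fmt.Println(cancelled(done), cancelled(done)) // prints "true true"
	fmt.Println(stopAll(5))                       // prints "5"
}
```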

Concurrency with shared variables

  • TODO

Testing

  • Similar to other aspects in the language, Go has a very minimal approach to testing frameworks:
    • Writing test code is very similar to writing the program itself
    • Write short functions focusing on a single task
    • Be careful about boundary conditions
    • Be careful about selecting and using data structures
    • Think about what inputs result in what outputs
  • Due to the similarity, the belief is that the same conventions, notations and tools can be applied to testing as well as writing Go code
  • go test is the test driver for Go packages
  • Files whose names end in _test.go are ignored by go build and compiled only by go test
  • Test files can contain:
    1. Test functions that result in either PASS or FAIL
    2. Benchmark functions
    3. Example functions
  • Let's focus on test functions first

Test Functions

  • Test files must import the testing package, and each test function must have the following signature:
func TestName(t *testing.T) {
  //
}
  • Test function names must begin with Test; the optional suffix Name must begin with a capital letter, e.g.:
func TestSin(t *testing.T) { }
func TestCos(t *testing.T) { }
  • See the palindrome tests for a very basic example
  • To run the above tests, we can simply navigate to the package and run go test
cd ./ch11/word1/
go test
  • On the latter 2 tests, note the usage of t.Errorf to avoid repetition
  • Also note that 2 of the tests fail. If we want to only run the failing tests, we may use the -run flag, which takes a regular expression, ie:
cd ./ch11/word1/
go test -v -run="French|Canal"
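  • As one may note, writing tests like this becomes tedious, and thus, table-driven tests are very popular in Go
  • t.Errorf reports a failure and lets the test function continue, whereas t.Fatalf stops the test function in its tracks. t.Fatalf is generally not recommended unless continuing would be pointless, since a single run should report as many failures as possible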
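A table-driven test might look like the sketch below, meant to be saved in a file whose name ends in _test.go and run with go test. The isPalindrome function here is a simplified, hypothetical version included only to keep the example self-contained; it is not the implementation from the palindrome example, and the table entries are made up.

```go
package main

import "testing"

// isPalindrome is a simplified stand-in for the function under test.
// It compares runes exactly, without ignoring case or punctuation.
func isPalindrome(s string) bool {
	r := []rune(s)
	for i, j := 0, len(r)-1; i < j; i, j = i+1, j-1 {
		if r[i] != r[j] {
			return false
		}
	}
	return true
}

func TestIsPalindrome(t *testing.T) {
	// Each row of the table is one test case.
	tests := []struct {
		input string
		want  bool
	}{
		{"", true},
		{"a", true},
		{"ab", false},
		{"kayak", true},
		{"été", true},
		{"palindrome", false},
	}
	for _, test := range tests {
		if got := isPalindrome(test.input); got != test.want {
			// t.Errorf records the failure and moves on to the next row.
			t.Errorf("isPalindrome(%q) = %v, want %v", test.input, got, test.want)
		}
	}
}
```

Adding a new case is a one-line change to the table, and because t.Errorf does not stop the loop, a single run reports every failing row.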

Testing a command

  • The go test tool also allows testing commands alongside packages with little effort
  • Although the main package normally generates an executable, it can also be imported as a library
  • See here for an example, noting how we mocked out the global variable out to capture the output in the tests
  • Despite the package being main, during tests, the package acts as a library exposing the Test function to the test driver
  • Take care not to call log.Fatal or os.Exit in the functions we are testing; these are to be reserved for the main function

White box testing

  • Black box and white box testing are both popular methods in software development
  • Black box testing assumes nothing about the internal implementation; rather it tests against the specified documentation, interfaces and API
  • Black box testing is great for empathizing with the users of the code to discover API flaws. It is also more robust as it requires less maintenance as the software evolves
  • White box tests on the other hand have access to internal structures of the code which may otherwise be unavailable to clients
  • White box testing can be very useful for testing the trickier parts of the code, and ensure internal invariants hold
  • Looking back, the tests for IsPalindrome are an example of black box testing
  • The tests for simpleMath are an example of white box testing
  • With simpleMath, particularly note how the global variable out is replaced, or "faked" during testing
  • Fakes such as the above provide many advantages, as they are easy to predict, observe and configure
  • Fakes also help avoid side effects such as updating production databases, interacting with external clients etc.
  • Let's take an example; we can see how this can be difficult to test without setting up the correct infrastructure to send emails
  • Now, let's take a look at the subsequent example, noting how there is a global variable notifyUser that can be faked during tests
  • Looking at the test, we note the following
    • We are mocking out the global notifyUser variable, and then using the fake for assertions
    • However, we are careful to restore the global variable with the following code snippet
	saved := notifyUser
	defer func() { notifyUser = saved }()
  • This pattern can be used to save and restore many types of global variables, including flags, debugging options, performance parameters etc.
  • Keep in mind, however, that this pattern works because go test does not normally run multiple tests concurrently
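The save-and-restore pattern can be sketched outside a test file as follows. The checkQuota function, its 90% threshold, the message text, and the captureNotification helper are all assumptions made for illustration; in a real test, the body of captureNotification would live inside a TestXxx function and its results would be checked with t.Errorf.

```go
package main

import "fmt"

// notifyUser is a package-level variable so that tests can replace it.
// The default implementation only prints; real code might send an email.
var notifyUser = func(user, msg string) {
	fmt.Printf("notifying %s: %s\n", user, msg)
}

// checkQuota notifies the user once usage reaches a hypothetical 90% threshold.
func checkQuota(user string, usedPercent int) {
	if usedPercent < 90 {
		return
	}
	notifyUser(user, fmt.Sprintf("you have used %d%% of your quota", usedPercent))
}

// captureNotification runs checkQuota with notifyUser replaced by a fake
// that records its arguments, then restores the original on return.
func captureNotification(user string, usedPercent int) (gotUser, gotMsg string) {
	saved := notifyUser
	defer func() { notifyUser = saved }() // restore the real function
	notifyUser = func(user, msg string) {
		gotUser, gotMsg = user, msg // record instead of sending
	}
	checkQuota(user, usedPercent)
	return gotUser, gotMsg
}

func main() {
	user, msg := captureNotification("joe", 95)
	fmt.Printf("%q %q\n", user, msg) // prints "joe" "you have used 95% of your quota"

	checkQuota("joe", 95) // the real notifyUser is back in place
}
```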

External test packages

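  • The Go specification forbids import cycles. However, when testing hierarchical packages, a test may need to import a package that itself depends on the package under test, which would create a cycle
  • This is especially true when one wants to perform integration tests
  • To allow for this, one can define external test packages: test files in the same directory whose package name carries the _test suffix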

External test package

  • However, when white box testing, the external test package may need privileged access to private functions
  • To allow such a backdoor, one can add a file named export_test.go to the package, containing code such as
package B
var PrivateFn = privateFn
  • Note how the above file does not contain any tests; it simply exposes an unexported identifier to the external test package under an exported name
  • The following commands may be used to see a list of files matching the varying categories we discussed
    • go list -f={{.GoFiles}} fmt: all files that go build would include in the fmt package
    • go list -f={{.TestGoFiles}} fmt: all files ending in _test.go that are only built during tests
    • go list -f={{.XTestGoFiles}} fmt: all files that make up the external test package; these are also test-only files, but note that they must import the fmt package in order to use it

Benchmarking

  • TODO

Profiling

  • TODO

Example functions

  • TODO
