## Agile Hardware Design
***
# Open-Source Project Development

## Prof. Scott Beamer
### sbeamer@ucsc.edu

## [CSE 293](https://classes.soe.ucsc.edu/cse293/Winter22/)

## Imagine What _Ideal_ Code Looks Like

* _**Correct**_ - does the right thing every time
  * If inputs are infeasible, it lets you know clearly
  * Not only is correct, but you are convinced it is correct
* _**Easy to work with or understand**_ - documented
  * Also code itself is easy to read or modify
* _**Efficient**_ - not much room left to further improve performance
* It is nearly impossible to accomplish all of these on the first draft of code, so you will need to _**REVISE**_ and improve
    * Today's lecture will cover some tools & techniques to help with this process

## Plan for Today

* Continuous integration
* Code management
* Documentation
* Open-source

## Continuous Integration (CI)

* _Motivation:_ in fast churn of development, bugs will creep into project
  * May have been there from beginning, and only surface later
  * May be an issue of how components interact

* _Solution:_ use shared resources to run more tests automatically (CI)
  * _Example:_ automatically running tests for every commit or pull request

* Frequent testing can catch bugs earlier and with less human effort

* Testing is useful beyond the project's internal development
  * Having CI (and making it publicly visible) lets others see you are testing
  * Can be used to screen/sanity check contributions (internal or external)
  * Dependence creators can see if their changes break your (downstream) project

## What is Needed to Set Up CI?

### Tests!
* Should have tests anyway, but CI makes good tests even more useful/valuable
* Worried something could pass your tests and be buggy? _Increase your test coverage_

### Scripts/Automation
* Lots of great resources & tools available
* Beyond writing tests, this is most of the effort for setting up CI

### Execution environment
* Can run locally or in the cloud
* GitHub currently provides easy/free setup with [Actions](https://github.com/features/actions)
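
As a rough illustration of the scripts/automation piece, the sketch below runs a list of check commands and folds the results into a single pass/fail exit code, which is the signal a CI service acts on. It is written in Python only for brevity; the function name `run_checks` and the idea of passing `(name, argv)` pairs are assumptions for this example, not part of any particular CI tool.

```python
import subprocess
import sys


def run_checks(commands):
    """Run each command in order; return 0 if all pass, 1 otherwise.

    `commands` is a list of (name, argv) pairs (a hypothetical format).
    Every check runs even after a failure, so one CI run reports all
    broken checks instead of only the first.
    """
    failed = []
    for name, argv in commands:
        result = subprocess.run(argv)
        status = "ok" if result.returncode == 0 else "FAILED"
        print(f"[{status}] {name}")
        if result.returncode != 0:
            failed.append(name)
    return 1 if failed else 0
```

A CI job would call something like `sys.exit(run_checks(checks))`, where `checks` lists the project's real build and test commands, so that a nonzero exit code marks the commit or pull request as failing.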

## Consider More Types of Testing (for CI)

* _Unit_ - tests a module or component in isolation, a key building block for a test suite
* _Integration_ - combines multiple (or all) components and ensures they work together
* _Regression_ - ensure things you thought worked still work
  * Maybe try out with older compilers or versions of your dependences
* _Smoke_ - subset of tests that check core/critical functionality
  * If one of these fails, it is definitely broken
* _Performance_ - ensure no regressions in PPA (power, performance, area)
* Example of "make tools do the work"
  * Easier/cheaper to have servers running tests than humans debugging
  * CI may run more extensive tests than you run normally during local development
  * Depending on when CI is run, may choose to run different types of tests
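
To make the distinction between test types concrete, here is a minimal sketch using Python's standard `unittest` module. The functions `saturating_add` and `accumulate` are hypothetical stand-ins for a component and a design built on it, and the smoke/full split is one possible policy, not a prescribed one.

```python
import unittest


def saturating_add(a, b, width=8):
    """Add two unsigned values, clamping at the maximum for `width` bits."""
    if a < 0 or b < 0:
        raise ValueError("inputs must be non-negative")
    return min(a + b, (1 << width) - 1)


def accumulate(values, width=8):
    """Sum a sequence using saturating_add (builds on the unit above)."""
    total = 0
    for v in values:
        total = saturating_add(total, v, width)
    return total


class UnitTests(unittest.TestCase):
    """Unit tests: exercise one component in isolation."""

    def test_no_overflow(self):
        self.assertEqual(saturating_add(3, 4), 7)

    def test_saturates(self):
        self.assertEqual(saturating_add(250, 10), 255)

    def test_rejects_negative(self):
        with self.assertRaises(ValueError):
            saturating_add(-1, 1)


class IntegrationTests(unittest.TestCase):
    """Integration tests: check that components work together."""

    def test_accumulate_saturates(self):
        self.assertEqual(accumulate([100, 100, 100]), 255)


def smoke_suite():
    """A fast subset that covers only core functionality."""
    suite = unittest.TestSuite()
    suite.addTest(UnitTests("test_no_overflow"))
    suite.addTest(IntegrationTests("test_accumulate_saturates"))
    return suite


def full_suite():
    """Every test; slower, so it may be reserved for less frequent CI runs."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    suite.addTests(loader.loadTestsFromTestCase(UnitTests))
    suite.addTests(loader.loadTestsFromTestCase(IntegrationTests))
    return suite
```

One possible arrangement is to run `smoke_suite()` on every commit and `full_suite()` on pull requests or on a schedule, for example with `unittest.TextTestRunner().run(full_suite())`.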

## Code Management

* Using version control (e.g. `git`) is essential
  * Can track changes over time, have alternate versions, easily allow collaboration

* What goes in which (git) repository?
  * Git _submodules_ allow you to pull in another git repo at a specific commit
    * Best used for tracking an external unreleased dependence
  * Recommend using fewer repos (or just 1), and only putting things in separate repos if they are intended for independent use

* Recommend keeping only a few long-lived git _branches_
  * Maintaining too many branches can become a burden

* _Pull Requests_ are a great way to methodically merge things in
  * Choose when to accept a contribution
  * Can give feedback (review) and revise before accepting

## Code Review Motivation

* If you think about it, much of the code you have written previously is like a rough draft
  * Once it worked and you were "done", did you go back to clean it up much?
  * How much did you clean it up?
  * If you did try to clean it up, what did you try to improve?

* Contrast that experience with writing text (that you hope is good)
  * Writing a rough draft is only the beginning
  * Even if "content" is there, will revise/rewrite just to improve _readability_ & _clarity_ 
  * Solicit feedback from others and act on it

## Code Review Summary

* _Code Reviews_ use the help of others to improve your code
  * Others may be able to spot weaknesses or issues you didn't think of
  * At a minimum, they will help make it more consistent and clear
    * Code is read far more times than it is written (even by you), so code readability matters

* Benefits
  * Reviewed code tends to be much better, or at least consistent
  * Preparing for code review motivates contributor to revise more in advance
  * Looking at, discussing, and receiving coding feedback makes you a better developer

## Code Review Process

* **1** - Typically creator starts process by requesting someone review a code contribution
  * Some projects have policies that make this an explicit requirement
* **2** - Reviewer reviews code and makes suggestions / requests
  * Can use software to easily annotate code for points of discussion
* **3** - Submitter revises and submits for another review (may take multiple rounds)
* **4** - Reviewer approves code and contribution is accepted

## What to Examine in a Code Review

* A project or organization may have their own policies and checklists, but consider the following...
* _**Correctness**_ - are there potential issues?
  * Passing tests & CI is the bare minimum, but are there other situations to consider?
* _**Readability/Code style**_ - is the code clear and does it follow conventions?
  * Can you suggest different code organization/naming that would make things more clear or easier to modify?
  * Can use tools like [scalastyle](http://www.scalastyle.org) to check some style issues (_linter_)
* _**Completeness**_ - does it include sufficient tests & documentation?
* Other factors? Is it appropriate or going to create new issues?

## Documentation Motivation

* Documentation serves multiple purposes
  * _Summarizes_ what the project does
  * _Instructions_ for how to use it
  * _Details_ internal structure and functionality

* Lacking documentation most harms potential users and contributors

* Good documentation benefits a project in multiple ways
  * Encourages users and contributors
  * Forces creators to think about their project from an outside user's perspective
    * May cause beneficial rethinking of features or their interfaces

## Relevant Documentation Tools

* `README` file in repository - bare minimum
  - Great for small projects, as it is simple to write and maintain
  - GitHub automatically renders it on code page if written in Markdown

* [Scaladoc](https://docs.scala-lang.org/overviews/scaladoc/for-library-authors.html) allows you to directly document the code - great for APIs
  - Add comments to code with special annotations, and tool generates pretty HTML
  - Some IDEs can render/interact with Scaladoc inline
  - Docs are next to code, so easier to keep in sync

* Static site - great for topic-oriented explanations
  - [readthedocs](https://readthedocs.org) is a helpful site/service for writing/hosting docs

## Documentation Writing Advice

* Be sure to include brief summary of overall function/purpose

* Emphasize what it does (_purpose_) over how it works internally (_implementation_)

* Emphasize how to use it (_interaction_) over trying to introduce many abstractions

* Do you find something really hard to explain?
  * Might be an indication the feature/issue/API is worth reconsidering
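
The advice above can be sketched with a documented function. The example uses a Python docstring, which plays a role similar to a Scaladoc comment; the function `gray_encode` is hypothetical and chosen only for illustration. The comment leads with a one-line summary, states the purpose, and shows usage, while leaving the bit-level implementation out of the documentation.

```python
def gray_encode(n):
    """Convert a non-negative integer to its Gray code.

    Consecutive integers map to codes that differ in exactly one bit,
    which is useful for counters that cross clock domains.

    Example:
        >>> gray_encode(2)
        3
        >>> [gray_encode(i) for i in range(4)]
        [0, 1, 3, 2]

    Raises:
        ValueError: if `n` is negative.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    return n ^ (n >> 1)
```

Because the usage example is written in doctest form, Python's `doctest` module can execute it, which lets CI check that the documentation still matches the code.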

## Why Should You Open-Source Your Work?

* Can help world with your work
  * Researchers should be trying to help
  * Even in a company, if code is not key advantage, may still be able to release

* Community can improve your code
  * External contributions can add functionality, fix bugs, improve performance

* Can raise your profile
  * Can expose your contributions beyond your organization
  * People notice/respect creators of commonly used things

* Why not?
  * You already benefited greatly from open-source, so return the favor
  * Is there a patent or business that would be infringed/harmed by releasing?
    * If no, why not release?

## Ingredients for Successful Open-Source Project

* Does something _useful_
* Works _correctly_ (need testing)
* _Documented_
* Some _publicity_
* Available with _open-source_ license

## Open-Source Licensing

* When you create something novel (e.g. code), you automatically get a _copyright_ to it
    * Restricts what others can do with it, even if posted publicly online
    * _Note:_ if created in the scope of your employment, your employer may get the copyright
* A _license_ grants people permission to use your thing under certain conditions
  * Different licenses have different permissions and restrictions
* When releasing open-source software, you need to include a license to give others permission to use it
  * The license specifies/clarifies what is covered and what is permitted

## Main Features of Open-Source Licenses

* Most open-source licenses allow use & modification, but there are details...

* Are you allowed to use it for commercial purposes?

* Are you required to distribute changes? Can you include it in a bigger project? (_copyleft_ vs. _permissive_)

* Are you allowed to use trademarks? (_branding_)

* Are you allowed to use patented functionalities within the code? (_patent grant_)

## Common Licenses

* _**BSD & MIT**_
  - commonly used, especially for academic projects
  - _permissive_, and works well for academic, industrial, and personal users

* _**GPL**_ (v3)
  - commonly used, especially for big community projects
  - _copyleft_, so some companies have restrictions about using code with it (even if only for dependence)

* _**Apache**_ (v2)
  - _permissive_ like BSD & MIT, but includes _patent grant_
  - reduces risk for large companies to use code with this license

* _**Unlicense & WTFPL**_
  - very permissive, essentially giving away to _public domain_

## Where to Put Your License?

* Somewhere **prominent!** (e.g. `LICENSE` in root directory)

* GitHub searches typical places and tries to automatically detect which license is used
  * Not perfect, sometimes have to tweak it to be recognized
  * Can also use their web UI to generate a LICENSE file

* [The Software Package Data Exchange (SPDX)](https://spdx.dev)
  * Standard to make it easier for humans/computers to recognize licenses for code
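
One common SPDX convention is a short tag near the top of each source file, such as `SPDX-License-Identifier: Apache-2.0`, that tools can read. The sketch below shows how a script might extract that tag; the function name `find_spdx_id` and the choice to scan only the first 10 lines are assumptions for this example, and it handles only a single license identifier, not compound expressions such as `MIT OR Apache-2.0`.

```python
import re

# Matches a tag such as "SPDX-License-Identifier: Apache-2.0" in a comment.
# Only a single identifier is captured (a simplification for this sketch).
SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*([A-Za-z0-9.+\-]+)")


def find_spdx_id(source_text, max_lines=10):
    """Return the SPDX identifier found in the first lines of a file, or None."""
    for line in source_text.splitlines()[:max_lines]:
        match = SPDX_TAG.search(line)
        if match:
            return match.group(1)
    return None


# Example usage on in-memory file contents.
scala_file = "// SPDX-License-Identifier: Apache-2.0\npackage example\n"
print(find_spdx_id(scala_file))
print(find_spdx_id("package example\n"))
```

A check like this could run in CI to flag source files that are missing a license tag.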

## Attracting Contributions to your Project

* Project needs to be _interesting/useful_
* Adequate _testing_ is necessary
  - for both you and them, to be sure it works
* _Documentation_
* _Responsive_ to _community_
  * Are there old Issues or Pull Requests left unaddressed?
  * Are there resources (mailing list, gitter, StackOverflow tag) to get help?
* Explicitly suggest things they can do (e.g. a list of suggested projects)

### Things (from today) to try out in your project

* CI
* Code reviews
* Documentation
* Releasing (as open-source)

## Links from Demo Inspecting [chisel3](https://github.com/chipsalliance/chisel3)

* CI
  - dated [circleci dashboard](https://app.circleci.com/pipelines/github/chipsalliance/chisel3?branch=master)
  - GitHub Actions [configuration](https://github.com/chipsalliance/chisel3/blob/master/.github/workflows/test.yml) and [dashboard](https://github.com/chipsalliance/chisel3/actions)
* Code Review (inside a [pull request](https://github.com/chipsalliance/chisel3/pull/2353))
* Documentation
  - scaladoc [source](https://github.com/chipsalliance/chisel3/blob/v3.5.0/src/main/scala/chisel3/util/Arbiter.scala) and [result](https://www.chisel-lang.org/api/latest/chisel3/util/Arbiter.html)
  - static site [source](https://raw.githubusercontent.com/chipsalliance/chisel3/master/docs/src/explanations/data-types.md) and [result](https://www.chisel-lang.org/chisel3/docs/explanations/data-types.html)
* [License](https://github.com/chipsalliance/chisel3/blob/master/LICENSE) 