diff --git a/why-postgres.qmd b/why-postgres.qmd index 54af787..8a8aa0a 100644 --- a/why-postgres.qmd +++ b/why-postgres.qmd @@ -4,8 +4,7 @@ description: | SQL is the backbone of any database system. However there are many variants of SQL. This decision post contains the reasons for using PostgreSQL, which is a powerful and feature-full variant of SQL. -author: Kristiane Beicher -date: last-modified +date: "2024-01-05" categories: - backend - database @@ -37,7 +36,7 @@ One of the most important functions of Seedcase is to handle data, and the most [MySQL](www.mysql.com) was first released in 1995 and is maintained by Oracle Corp. It is an open source platform with the option to deploy either as a local server solution or cloud based. The implementation languages are C and C++, and it runs of a variety of operating systems. The system allows access through standard technologies (ADO.NET, JDBC, ODBC, and native APIs). ::: columns -::: {.column style="font-size: 90%"} +::: {.column} #### Benefits @@ -50,7 +49,7 @@ One of the most important functions of Seedcase is to handle data, and the most * There are a number of ways for MySQL to interact with Apache Parquet files. ::: -::: {.column style="font-size: 90%"} +::: {.column} #### Drawbacks @@ -66,7 +65,7 @@ One of the most important functions of Seedcase is to handle data, and the most [PostgreSQL](www.postgresql.org) was first released in 1989 from UC Berkeley and is maintained by the PostgreSQL Development Group. It is an open source platform with the option to deploy either as a local server solution or cloud based. The implementation language is C, and it runs of a variety of operating systems. The system allows access through standard technologies (ADO.NET, JDBC, ODBC, a native C library, and streaming APIs). 
::: columns -::: {.column style="font-size: 90%"} +::: {.column} #### Benefits @@ -81,7 +80,7 @@ One of the most important functions of Seedcase is to handle data, and the most * It is possible to create columnar based tables directly in PostgreSQL. ::: -::: {.column style="font-size: 90%"} +::: {.column} #### Drawbacks @@ -95,7 +94,7 @@ One of the most important functions of Seedcase is to handle data, and the most First released in 2000, SQLite is slightly different to the two systems described above, as it is an embedded serverless database primarily maintained by an international team of programmers (see [About SQLite](https://www.sqlite.org/about.html)). It is an open source platform with the option to deploy either locally or in the cloud. The implementation language is C, and it is platform independent. The system allows access through standard technologies (ADO.NET, JDBC, and ODBC). ::: columns -::: {.column style="font-size: 90%"} +::: {.column} #### Benefits @@ -106,7 +105,7 @@ First released in 2000, SQLite is slightly different to the two systems describe * There is always a risk that an open source community will break apart and leave a product unsupported, but the risk here looks minimal. The explicitly stated intention from the core developers of SQLite is to support the product until at least 2050. ::: -::: {.column style="font-size: 90%"} +::: {.column} #### Drawbacks @@ -121,7 +120,7 @@ First released in 2000, SQLite is slightly different to the two systems describe ## Decision Outcome -We've decided to work with PostgreSQL as our backend database as it fulfills all our needs and is a very popular open source tool. MySQL would be the other obvious choice, the application does everything that Seedcase needs, but the user community for PostgreSQL seems to be a bit more active. 
SQLite is quite popular within the application developer community, but it doesn't have a reliable multi-user functionality, so it may be an uphill battle to get it to do the things we are hoping to do with Seedcase. +We've decided to work with PostgreSQL as our backend database as it fulfils all our needs and is a very popular open source tool. MySQL would be the other obvious choice, the application does everything that Seedcase needs, but the user community for PostgreSQL seems to be a bit more active. SQLite is quite popular within the application developer community, but it doesn't have a reliable multi-user functionality, so it may be an uphill battle to get it to do the things we are hoping to do with Seedcase. ### Consequences diff --git a/why-python.qmd b/why-python.qmd index c6a4cb8..9bc2b38 100644 --- a/why-python.qmd +++ b/why-python.qmd @@ -4,57 +4,118 @@ description: | Python is one of the most common and widely used programming languages. It is used across multiple domains and industries, which means more people would be familiar with using it. -author: "Richard Ding" date: "2023-03-22" -date-modified: last-modified categories: - programming - development - software-architecture --- - +::: content-hidden +Use other decision posts as inspiration to writing these. +Leave the content-hidden sections in the text for future reference. +::: -## Introduction +## Context and Problem Statement -Since Seedcase is a data management system and software, it requires a -programming language for its development that can handle large amounts +::: content-hidden +State the context and some background on the issue, then write a +statement in the form of a question for the problem. +::: + +One of the first things to do when deciding to write a software application is to decide on the programming language. There are several languages that can be used, among them C++, Java, Python, and R. 
In the context of Seedcase it is important to choose a language that can handle large amounts of data, provide efficient data processing capabilities, and integrate well with other technologies commonly used in the research area. +> Which programming language should we use for developing the Seedcase application? + +## Decision Drivers + +::: content-hidden +List some reasons for why we need to make this decision and what things +have arisen that impact work. +::: + +In the context of Seedcase it is important to choose a language that can handle large amounts of data, provide efficient data processing capabilities, and integrate well with other technologies commonly used in the research area. There is also a consideration with regard to the skills already available in the core team, as we would like to minimize the amount of time that we will need to use in order to be able to program the application. + ## Considered Options -We considered [Python](https://www.python.org), [Java](https://www.java.com/en/), [C++](https://cplusplus.com), and [R](https://www.r-project.org). +::: content-hidden +List and describe some of the options, as well as some of the benefits and +drawbacks for each option. 
+::: + +### C++ + +::: {.columns} +::: {.column} +#### Benefits + +- Item 1 +::: +::: {.column} +#### Drawbacks + +- Item 1 +::: +::: + +### Java + +::: {.columns} +::: {.column} +#### Benefits + +- Item 1 +::: +::: {.column} +#### Drawbacks + +- Item 1 +::: +::: + +### Python + +::: {.columns} +::: {.column} +#### Benefits + +- Item 1 +::: +::: {.column} +#### Drawbacks + +- Item 1 +::: +::: + +### R + +::: {.columns} +::: {.column} +#### Benefits + +- Item 1 +::: +::: {.column} +#### Drawbacks + +- Item 1 +::: +::: ## Decision Outcome -We have decided to use Python as the main development language for the -following reasons: - -- Is widely used in the research area, particularly in data science - and machine learning, and has a rich ecosystem of libraries and - tools for data processing and analysis. -- It's syntax is concise and easy to read, making it ideal for rapid - development and prototyping. -- Has a large community of developers who contribute to its - development, ensuring that it is constantly evolving and improving. -- Has strong support for web development, with a number of popular - frameworks such as [Django](https://www.djangoproject.com) and [Flask](https://flask.palletsprojects.com/en/2.3.x/), making it easy to build RESTful - APIs for Seedcase. -- Has excellent support for working with databases, with libraries - such as [SQLAlchemy](https://www.sqlalchemy.org) and Django ORM, making it easy to manage and - query large datasets. -- Is a cross-platform language, making it easy to deploy the system on - a variety of operating systems and hardware. - -While Java and C++ are also capable languages for building data -management systems, they are generally more complex and have a steeper -learning curve than Python. R is a powerful language for data analysis -and visualization, but it is less suitable for building large-scale web -applications. 
- -## Conclusion - -Python is the most suitable option for this project, as it provides a -powerful, flexible, and easy-to-use platform for building a data -management system. Python is also one of the most common and widely used programming languages and is used across multiple domains and industries. +::: content-hidden +What decision was made, use the form "We decided on CHOICE because of +REASONS." +::: + + + +### Consequences + +::: content-hidden +List some potential consequences of this decision. +::: diff --git a/why-ruff.qmd b/why-ruff.qmd index 217beeb..f4d036b 100644 --- a/why-ruff.qmd +++ b/why-ruff.qmd @@ -4,10 +4,7 @@ description: | Enforcing style of code with automatic linters and formatters is important for code reviews to focus on content, not style. This post covers the reasons why we decided on Ruff for our linting and formatting purposes. -author: - - "Richard Ding" - - "Luke Johnston" -date: 2023-11-27 +date: "2023-11-27" categories: - contributing - culture @@ -24,7 +21,7 @@ categories: ## Context and Problem Statement -Humans are prone to error when writing, whether it is code or text. In a team setting, more people working on the same things increase the chance of more issues occuring. And writing code is not done for the computer, but for other humans to read, so readability and consistency in style become important when reviewing that code. So our problem is: +Humans are prone to error when writing, whether it is code or text. In a team setting, more people working on the same things increase the chance of more issues occurring. And writing code is not done for the computer, but for other humans to read, so readability and consistency in style become important when reviewing that code. So our problem is: > How do we enforce a consistent style across people and code? And how do we catch simple errors that happen because of the style or format of the code? @@ -38,7 +35,7 @@ Humans are prone to error when writing, whether it is code or text. 
In a team se The terms "linting" or "formatting" are used to describe scanning, analysing, and (potentially) fixing code for style and typographical issues. The important difference between linting and formatting is that linting only tells you about the issues while formatting will fix (many of) the issues. Some issues can't be solved from formatting alone, so both linting and formatting are often used together. -There are many tools available for Python, with many websites that have detailed comparisons of them (like [this](https://realpython.com/python-code-quality/), [this](https://geekflare.com/python-linter-platforms/), or [this](https://github.com/caramelomartins/awesome-linters#python) website). Based on this list and based on quick searchs on Google, these are the tools that come up the most often: +There are many tools available for Python, with many websites that have detailed comparisons of them (like [this](https://realpython.com/python-code-quality/), [this](https://geekflare.com/python-linter-platforms/), or [this](https://github.com/caramelomartins/awesome-linters#python) website). Based on this list and based on quick searches on Google, these are the tools that come up the most often: - [Pylint](https://github.com/pylint-dev/pylint) - [Flake8](https://github.com/PyCQA/flake8) @@ -49,45 +46,72 @@ Below is a detailed description of the pros and cons based on what others have w ### Pylint -- Pros: +::: {.columns} +::: {.column} +#### Benefits + - Very old, well-established linter - Large community of users and contributors - Very comprehensive list of checks - Highly configurable - Is integrated into many other tools (like flake8, black, and ruff) - Linting feedback is extensive -- Cons: - - Too much configuration needed. - - Slow to run. - - Is often not needed to use on it's own because it is integrated with other tools. - - Linting feedback is extensive and a bit overwhelming. 
+::: +::: {.column} +#### Drawbacks + + - Too much configuration needed + - Slow to run + - Is often not needed to use on its own because it is integrated with other tools + - Linting feedback is extensive and a bit overwhelming +::: +::: ### Flake8 -- Pros: +::: {.columns} +::: {.column} +#### Benefits + - Extensive list of checks - Includes many other linters - Often used with formatters like Black - Customizable - - Large userbase and community + - Large user base and community - Can use plugins to expand functionality -- Cons: +::: +::: {.column} +#### Drawbacks + - Only lints and doesn't format - Is integrated into newer tools (like Ruff), so might not need to be used on it's own +::: +::: ### Black -- Pros: - - Is a code formatter, not linter. - - Opinionated set of rules for code formatting, so removes need to configure things. +::: {.columns} +::: {.column} +#### Benefits + + - Is a code formatter, not linter + - Opinionated set of rules for code formatting, so removes need to configure things - Recommend to use with a linter (often suggested to use flake8 or pylint) -- Cons: +::: +::: {.column} +#### Drawbacks + - Difficult to configure customizations - Integrated/compatible with newer tools (like Ruff) +::: +::: ### Ruff -- Pros: +::: {.columns} +::: {.column} +#### Benefits + - Very fast - Implements almost all of Black and flake8 features - Implements many other features from other code analysis and checking tools @@ -95,9 +119,14 @@ Below is a detailed description of the pros and cons based on what others have w - Newer and has more modern development - Configuration is available and relatively straightforward to use - Can be implemented alongside other tools -- Cons: +::: +::: {.column} +#### Drawbacks + - Is still new, so bugs and other features are still being developed - Does not yet have all of pylint features implemented +::: +::: ## Decision Outcome @@ -105,5 +134,5 @@ We decided on Ruff because it is a newer tool that implements many of the 
other ## Potential Consequences -- We might miss out on some features from pylint (since right now we won't include pylint). -- There may be some bugs along the way because Ruff is relatively new, though this can be minimized by relying on more stable versions of it. +- We might miss out on some features from pylint (since right now we won't include pylint) +- There may be some bugs along the way because Ruff is relatively new, though this can be minimized by relying on more stable versions of it diff --git a/why-standard-shortcuts.qmd b/why-standard-shortcuts.qmd index 2bd3f39..307244f 100644 --- a/why-standard-shortcuts.qmd +++ b/why-standard-shortcuts.qmd @@ -1,8 +1,7 @@ --- title: "Why standardized snippets" description: "The larger a project is, the more important it becomes to have a joint set of standards when writing documentation. We decided to set up shortcuts that are shared across the team, so that all documentation follows the same classification and formatting." -author: "Kristiane Beicher" -date: last-modified +date: "2023-11-23" categories: - code snippets - communication @@ -29,8 +28,6 @@ As the documentation for Seedcase growing, and we have reached a level where it We also need to find a way to ensure the consistent use of keywords, so that when a reader clicks a `tag` in a document they get all relevant pages, and don't miss any due to the fact that half are tagged in one way (eg. `database`) and the other half is tagged slightly differently (eg `databases`). - - ## Considered Options We have so far looked at two ways of streamlining the writing of documentation through the use of code snippets and shared keywords, which can be set using the same settings file. There aren't many "generic" methods to share code snippets across IDE's (e.g. between RStudio or PyCharm), so we only investigated ways of adding these in VS Code. 
diff --git a/why-text-based.qmd b/why-text-based.qmd index 046fb32..fb3a69b 100644 --- a/why-text-based.qmd +++ b/why-text-based.qmd @@ -1,16 +1,14 @@ --- title: "Why text/code-based tools?" description: "Our reasons for selecting tools that are based on code, rather than GUIs." -author: "Luke W. Johnston" -date: last-modified +date: "" draft: true categories: - documentation - code - text --- - -TODO: Need to fill this in, later. + "Everything as code" @@ -27,3 +25,56 @@ https://octopus.com/blog/what-is-everything-as-code https://medium.com/geekculture/code-as-diagrams-whats-the-point-13dbe6053738 https://www.cloudbolt.io/blog/3-advantages-and-challenges-of-infrastructure-as-code-iac/ + +::: content-hidden +Use other decision posts as inspiration to writing these. +::: + +## Context and Problem Statement + +::: content-hidden +State the context and some background on the issue, then write a +statement in the form of a question for the problem. +::: + +## Decision Drivers + +::: content-hidden +List some reasons for why we need to make this decision and what things +have arisen that impact work. +::: + +## Considered Options + +::: content-hidden +List and describe some of the options, as well as some of the benefits and +drawbacks for each option. +::: + +### Option 1 + +::: {.columns} +::: {.column} +#### Benefits + +- Item 1 +::: +::: {.column} +#### Drawbacks + +- Item 1 +::: +::: + +## Decision Outcome + +::: content-hidden +What decision was made, use the form "We decided on CHOICE because of +REASONS." +::: + +### Consequences + +::: content-hidden +List some potential consequences of this decision. +:::