Google Summer of Code

Andrew Olsen edited this page Mar 31, 2022 · 14 revisions

Getting Started

Download Kart, follow the tutorial, and start exploring with your own datasets. If you find any bugs, report them. You could also install the development version of Kart and try a fix, which is a good way to learn how the project works. We'd also welcome any suggestions to improve the help or documentation.

Start communicating with the developers — start a discussion or explore the issue tracker to see the current topics we've been focussing on.

Have a look through the ideas list and see whether any of the proposed ideas interest you.

How to Apply

  1. Read the links and instructions given on this page. We've tried to give you all the information you need to be an awesome GSoC applicant.

  2. Start a discussion and talk with your prospective mentor about what they expect of GSoC applicants. They can help you refine your project idea and get a plan together. Listening to the mentors' recommendations is very important at this stage! As part of your proposal we expect you to research possible solutions and plan how your solution will fit together. We have some general ideas, but this is your project and proposal. You can email your mentors privately if you want; contact details are linked below.

  3. Usually we expect GSoC contributors to have fixed a bug or made an improvement, and to have submitted it as a pull request to Kart. Your code doesn't have to be accepted and merged, but it does have to be visible to the public and it does have to be your own work.

  4. Write your application (with help from your mentors!). The application template is available here. All applications must go through Google's application system from 4 April; we can't accept any application unless it is submitted there. Make it easy for your mentors to give you feedback. If you're using Google Docs, enable comments and submit a "draft" (we can't see the "final" versions until applications close). If you're using a format that doesn't accept comments, make sure your email address is on the document, and don't forget to check for feedback!

  5. Submit your application to Google before the deadline (19 April at 1800 UTC). Google does not extend this deadline, so it's best to be prepared early. You can keep editing your application up until the deadline.

    💡 Communication is probably the most important part of the application process. Talk to the mentors and other developers, listen when they give you advice, and demonstrate that you've understood by incorporating their feedback into what you're proposing. If your mentors tell you that a project idea won't work for them, you're probably not going to get accepted unless you change it.

Application Template

An ideal application will contain 5 things:

  1. A descriptive title
  2. Information about you, including full contact information and which time zone you're in. If you're studying, include your institution, course, and year.
  3. Link to a code contribution you have made/proposed to Kart. If you've made some contributions to other open source projects that you're proud of please link to them too.
  4. Information about your proposed project. This should be fairly detailed and include a timeline.
  5. Information about other commitments that might affect your ability to work during the GSoC period (exams, classes, holidays, other jobs, weddings, etc.). We can work around a lot of things, but it helps to know in advance.

🤍 Some of the content above was originally sourced from the fine folks at Python Summer of Code and their documentation (CC-BY).


Ideas List

This is our current ideas list for 2022. Remember, you're welcome to propose your own idea, but you need to start a discussion with the mentors before submitting so we can all make sure it has the best chance of being accepted. Keep checking back: this list will evolve as we go along and we'll flesh out further details too.

Kart CLI Help Improvements

  • Project Size: Medium (175h)
  • Mentors: @rcoup, @craigds
  • Skills needed: Python

Kart could better support CLI users as they interact with the commands and the data in their repositories. A first step is tab completion that works smoothly and consistently for Kart commands and their options. That could then be expanded so that datasets, branches, tags, files, metadata, features, and other context-relevant information are also offered where appropriate. Kart currently has no means of building or exposing man-style documentation; establishing a framework for this that the project can build upon would also fit into this project (eg: via kart x --help). This should be cross-platform as much as possible, supporting bash/Zsh/fish/etc as well as PowerShell on Windows. Investigating how Git, aws-cli (and other tools you've used with amazing help systems) approach this problem might be a good starting point for research.
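To make "context-relevant" completion concrete, here is a minimal sketch using only the standard library. The command table, option names, and the complete() function are placeholders invented for illustration, not Kart's actual commands or API. A real implementation would hook into the CLI framework and the shells' completion mechanisms.

```python
# Placeholder command table for illustration only; the real Kart
# commands and options differ and would be discovered from the CLI itself.
COMMANDS = {
    "checkout": ["--branch", "--force"],
    "commit": ["--message", "--amend"],
    "diff": ["--output-format", "--crs"],
}


def complete(words, incomplete, branches=()):
    """Return sorted completion candidates for the word being typed.

    words: the words already typed after the program name.
    incomplete: the partial word under the cursor (may be empty).
    branches: branch names; a real implementation would read these
        from the repository rather than take them as an argument.
    """
    if not words:
        # Nothing typed yet: complete the subcommand name.
        candidates = COMMANDS
    elif incomplete.startswith("-"):
        # Completing an option of the chosen subcommand.
        candidates = COMMANDS.get(words[0], [])
    elif words[0] == "checkout":
        # Context-relevant values: here, checkout is assumed to take a branch.
        candidates = branches
    else:
        candidates = []
    return sorted(c for c in candidates if c.startswith(incomplete))
```

For example, complete(["checkout"], "ma", branches=["main", "feature"]) offers only "main". The interesting part of the project is the last branch of the function: deciding, per command and argument position, which repository information to offer.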

Attachments Support

  • Project Size: Medium (175h)
  • Mentors: @olsen232
  • Skills needed: Python, Git

Kart enables version control of vector or tabular datasets, just as Git enables version control of files and folders. If Kart also supported version control of files and folders, then any files that are relevant to these datasets - perhaps README files, images and thumbnails, documentation PDFs, licenses, metadata XML - could be stored in the same version controlled repository alongside the datasets they refer to. Kart is already capable of storing version controlled files, using a Git object database, but every Kart operation needs to be modified so that it works simultaneously on tabular datasets and on files and folders (ie, the command to commit changes should simultaneously commit tabular changes from a database and file changes from the filesystem). And we need a user experience that doesn't lead to datasets accidentally being committed as attachments!

OGC Features API Support

  • Project Size: Medium (175h) or Large (350h)
  • Mentors: @rcoup, @craigds
  • Skills needed: Python, HTTP APIs

OGC API - Features (OAPIF) is the successor to WFS as the key open standard for people to create, modify and query vector spatial data on the web. We should be able to serve datasets via OAPIF directly from Kart repositories as well as potentially integrate Kart repositories into larger OAPIF server infrastructures like pygeoapi or GeoServer. There are several components for OAPIF support as defined by the "Parts" of the standard: 1:Core; 2:Coordinate Reference Systems; 3:Filtering; and 4:Create, Replace, Update, and Delete. So there are opportunities for exploring this project from some different angles. The project should aim to add robust support for serving read-only OAPIF Core responses directly from Kart repositories, then deeper support for one or more of:

  1. developing plugins for pygeoapi (Python), GeoServer (Java), or other OAPIF servers
  2. additional Parts of the standard, eg: CRS; filtering, optionally with indexing support; writes
  3. support for incorporating & exposing versioning history
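As a rough picture of what a read-only Core response involves, the sketch below builds one page of a feature collection for an items request. The function name items_response, the base_url argument, and the offset-based paging are assumptions made for illustration: Core defines the limit parameter and the "next" link, but leaves the paging mechanism to the server. Reading features out of a Kart repository is not shown.

```python
from datetime import datetime, timezone


def items_response(features, base_url, limit=10, offset=0):
    """Build one GeoJSON FeatureCollection page for <base_url>/items.

    features: an iterable of GeoJSON Feature dicts (how they are read
        from a repository is outside this sketch).
    offset: an assumed paging scheme, not something Core mandates.
    """
    features = list(features)
    page = features[offset:offset + limit]
    links = [{
        "rel": "self",
        "type": "application/geo+json",
        "href": f"{base_url}/items?limit={limit}&offset={offset}",
    }]
    if offset + limit < len(features):
        # Only advertise a next page when more features remain.
        links.append({
            "rel": "next",
            "type": "application/geo+json",
            "href": f"{base_url}/items?limit={limit}&offset={offset + limit}",
        })
    return {
        "type": "FeatureCollection",
        "features": page,
        "numberMatched": len(features),
        "numberReturned": len(page),
        "timeStamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "links": links,
    }
```

Materialising the whole feature list is only acceptable in a sketch. For large datasets the server would need to stream features and page without counting everything, which is one place where the indexing options above come in.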

Multi-version Spatial Indexing

  • Project Size: Medium (175h) or Large (350h)
  • Mentors: @rcoup, @craigds
  • Skills needed: Python, C/C++

Being able to serve and consume data directly from Kart repositories is a key goal of the project. To that end, being able to query data spatially very quickly is important. Typically an R-tree or one of its variants is used for this, but that approach assumes a static dataset. In Kart's case we have versions, and querying efficiently should not require a new index for each commit. One option is a multi-version R-tree index (eg: MVR-tree/PPR-tree or others), which can reuse the same index for different commits. The project would aim to select and integrate an appropriate spatial index into Kart (eg: building on libspatialindex or other implementations), including robust index updates, handling of coordinate reference systems, and stable, high-performance querying. The spatial indexing will be the basis of other future features of Kart.
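The core idea behind sharing one index across versions can be shown without any tree at all: each entry records the range of versions in which it is live, and a query filters on both the bounding box and the version. The sketch below is a flat list with a linear scan, not an MVR-tree, and it assumes versions are simple increasing integers, whereas real commit history is a graph. The names Entry and query are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

BBox = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


@dataclass
class Entry:
    feature_id: int
    bbox: BBox
    added_in: int                     # first version that contains the feature
    removed_in: Optional[int] = None  # first version without it; None = still live


def intersects(a, b):
    """True if two bounding boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def query(entries, bbox, version):
    """Return sorted IDs of features live at `version` whose bbox intersects `bbox`."""
    return sorted(
        e.feature_id
        for e in entries
        if e.added_in <= version
        and (e.removed_in is None or version < e.removed_in)
        and intersects(e.bbox, bbox)
    )
```

A new commit only appends entries for added features and closes the lifespan of removed ones, so older versions stay queryable from the same structure. A multi-version tree applies the same lifespan idea to its nodes so that queries avoid scanning every entry.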

Simple Repository Hosting

  • Project Size: Medium (175h)
  • Mentors: @rcoup
  • Skills needed: Python, Docker, Git, Linux administration, Writing

Make it easy for people to host Kart repositories on their own infrastructure by establishing best practices, tools, and guidelines. Kart datasets can be large, so configuring Git well to host Kart repositories efficiently is important. In addition, Kart supports server-side spatial filtered clones, but this requires indexing when pushes are received. And as repositories are updated, maintenance needs to be run on them to keep them working efficiently. The project would be to make all these pieces work well together, and to design and build a Docker setup for a Gitolite (or similar) configuration with Kart, which can be used to host Kart repositories via SSH or as a base for users to build on for their own needs.
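One plausible place to trigger indexing when pushes are received is a Git post-receive hook, which Git runs on the server with one "old-id new-id ref-name" line per updated ref on standard input. The sketch below only parses that input; the function name updated_branches is invented, and which indexing or maintenance command to run for each branch is deliberately left out.

```python
# Git passes an all-zero object ID when a ref is created (old side)
# or deleted (new side). 40 characters assumes a SHA-1 repository.
ZERO_SHA = "0" * 40


def updated_branches(hook_input):
    """Return (branch, new_commit) pairs for branches created or updated by a push.

    hook_input: the text a post-receive hook reads from stdin.
    """
    result = []
    for line in hook_input.splitlines():
        if not line.strip():
            continue
        old, new, ref = line.split(" ", 2)
        if new == ZERO_SHA:
            continue  # the ref was deleted, so there is nothing to index
        if ref.startswith("refs/heads/"):
            result.append((ref[len("refs/heads/"):], new))
    return result
```

In an actual hook script this would be called as updated_branches(sys.stdin.read()), and each returned commit would then be handed to whatever indexing step the hosting setup defines.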

Add more working copy types

  • Project Size: Medium (175h)
  • Mentors: @olsen232
  • Skills needed: Python, Git

Kart stores database rows internally in a JSON-like format, which it can then check out into working copies that are database tables in any of the following databases: GeoPackage, PostgreSQL, Microsoft SQL Server, or MySQL (with a focus on making sure that geospatial data is supported). This project is to add support for Kart to check out the same data into a working copy in one of the following formats:

  • Esri Shapefile
  • GeoJSON
  • CSV (comma-separated values)

Implementing one of these types of working copies will involve three main areas:

  • Firstly, simply adapting all of the data types Kart supports from Kart's native format to and from the working copy format.
  • Secondly, working around the lack of triggers in these file types (all existing working copy formats use triggers to track user edits, which makes generating working-copy diffs much more efficient).
  • Thirdly, working around any missing features in the target format, including suppressing diffs that are not user-desired edits but are only due to limitations in the target format.

Of the three areas, the challenges involved in this third area will depend the most on the working copy type chosen - some research upfront to work out what is involved for each type could inform your choice of which working copy type to begin implementing.
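The second and third areas can be sketched together. Without triggers, one fallback is to compare the committed rows with the rows currently in the file, keyed by primary key. To avoid reporting format limitations as edits, the committed row is first passed through the same lossy conversion the format applies. The names diff_rows and csv_normalise are invented, and the "everything becomes text" rule is an assumed simplification of a CSV working copy.

```python
def diff_rows(committed, working, normalise=lambda row: row):
    """Compare two {primary_key: row} mappings and classify the changes.

    normalise maps a committed row to what the target format can
    represent, so that lossy round-trips are not reported as edits.
    Returns (inserts, updates, deletes); updates maps pk -> (old, new).
    """
    inserts = {pk: row for pk, row in working.items() if pk not in committed}
    deletes = {pk: row for pk, row in committed.items() if pk not in working}
    updates = {
        pk: (committed[pk], working[pk])
        for pk in committed.keys() & working.keys()
        if normalise(committed[pk]) != working[pk]
    }
    return inserts, updates, deletes


def csv_normalise(row):
    # Assumption for illustration: a CSV working copy stores every value
    # as text and writes missing values as empty strings.
    return {key: "" if value is None else str(value) for key, value in row.items()}
```

With csv_normalise, a committed height of 10 and a working-copy value of "10" compare equal, so only genuine user edits are reported. A full scan like this is much slower than trigger-based tracking on large datasets, which is why the choice of format and change-detection strategy deserves research up front.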