Character Literals
cabmeurer committed Aug 9, 2022
1 parent 21d39ef commit 2660f2f
Showing 1 changed file with 97 additions and 0 deletions.
97 changes: 97 additions & 0 deletions proposals/p1964.md
# Character literals

<!--
Part of the Carbon Language project, under the Apache License v2.0 with LLVM
Exceptions. See /LICENSE for license information.
SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-->

[Pull request](https://github.com/carbon-language/carbon-lang/pull/1964)

<!-- toc -->

## Table of contents

- [Problem](#problem)
- [Background](#background)
- [Proposal](#proposal)
- [Details](#details)
- [Rationale](#rationale)
- [Alternatives considered](#alternatives-considered)

<!-- tocstop -->

## Problem

This proposal specifies lexical rules for constant characters in Carbon.

## Background

In theory, we could reuse string literals wherever a single character is
needed. However, a distinct lexical syntax for character literals, separate
from string literals, would make code easier to read.

## Proposal

The idea is to write and handle a character literal the same way as a string
literal, but delimited by single quotes (') rather than double quotes (").

As with string literals, the type of a character literal would depend on its
contents.

`var w: Character = 'w';`

We will not support:

- Multi-line literals
- "raw" literals (using #'x'#)
- Empty character literals ('')

## Details

A character literal is a sequence of characters enclosed in single quotes ('),
excluding:

- Newline
- Single quote (')
- Backslash (\)
- Escape sequences
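
The exclusion rules above can be sketched as a small validity check. This is
only an illustration in Python, not part of the proposal: the function name
`is_simple_char_literal` is hypothetical, and the sketch assumes that rejecting
every backslash is enough to reject every escape sequence. It also rejects the
empty literal, which the Proposal section lists as unsupported.

```python
def is_simple_char_literal(text: str) -> bool:
    """Return True if `text` is a single-quoted literal with allowed contents.

    Hypothetical helper: it only checks the exclusions listed above and does
    not model escape sequences or types.
    """
    # The literal must be delimited by a single quote on both sides.
    if len(text) < 2 or text[0] != "'" or text[-1] != "'":
        return False
    body = text[1:-1]
    # Empty character literals are not supported.
    if body == "":
        return False
    # Newlines, single quotes, and backslashes may not appear in the body.
    # Rejecting backslashes also rejects anything that starts an escape.
    return not any(ch in "\n'\\" for ch in body)
```

For example, `is_simple_char_literal("'w'")` accepts the literal, while `"''"`
and a literal whose body contains a backslash are both rejected.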

The type of a character literal will depend on its contents, so that 'c' and
'b' would have different types (as would 'b' and "b"). However, '\n' and
'\u{A}' would have the same type, because both denote the same Unicode code
point, U+000A.
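
To illustrate why two spellings can share a type, the sketch below maps the
body of a character literal (the text between the quotes) to a code point. It
is a Python illustration under stated assumptions: the helper
`literal_code_point` and the `SIMPLE_ESCAPES` table are hypothetical, and only
`\n` and `\u{...}` are escapes that actually appear in this proposal. If the
type is keyed on the resulting code point, then equal code points would give
equal types.

```python
import re

# Hypothetical table of single-character escapes; the proposal itself only
# mentions \n explicitly.
SIMPLE_ESCAPES = {"n": 0x0A, "t": 0x09, "r": 0x0D, "\\": 0x5C, "'": 0x27}

# Matches a body of the form \u{HEX}, such as \u{A}.
UNICODE_ESCAPE = re.compile(r"\\u\{([0-9A-Fa-f]+)\}")


def literal_code_point(body: str) -> int:
    """Return the code point denoted by the body of a character literal."""
    match = UNICODE_ESCAPE.fullmatch(body)
    if match:
        return int(match.group(1), 16)
    if len(body) == 2 and body[0] == "\\" and body[1] in SIMPLE_ESCAPES:
        return SIMPLE_ESCAPES[body[1]]
    if len(body) == 1 and body not in ("\\", "'", "\n"):
        return ord(body)
    raise ValueError(f"unsupported character literal body: {body!r}")
```

Under this sketch, the bodies `\n` and `\u{A}` both map to U+000A, while `c`
and `b` map to different code points.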

No restriction is placed on the number of UTF-8 code units in a character
literal, but conversions from character literal type to each kind of character
type would be supported only for characters that are representable, in the same
way as we restrict integer literals to only convert to integer types in which
they are representable.
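
One possible reading of "representable" is that the character fits in a single
code unit of the destination type. The Python sketch below illustrates that
reading only; the proposal does not name the character types or define the
check, so `fits_in_one_code_unit` and the three encodings used as stand-ins for
8-bit, 16-bit, and 32-bit character types are assumptions.

```python
# Size in bytes of one code unit for each stand-in encoding (assumed mapping
# from hypothetical 8-, 16-, and 32-bit character types).
UNIT_BYTES = {"utf-8": 1, "utf-16-le": 2, "utf-32-le": 4}


def fits_in_one_code_unit(code_point: int, encoding: str) -> bool:
    """Return True if the code point encodes to exactly one code unit."""
    encoded = chr(code_point).encode(encoding)
    return len(encoded) == UNIT_BYTES[encoding]
```

With this check, U+0077 ('w') fits in every stand-in type, U+00E9 needs two
UTF-8 code units and so would convert only to the wider types, and U+1F600
fits only in the 32-bit type. This mirrors how an integer literal such as 300
converts to a 16-bit integer type but not to an 8-bit one.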

## Rationale

TODO: How does this proposal effectively advance Carbon's goals? Rather than
re-stating the full motivation, this should connect that motivation back to
Carbon's stated goals and principles. This may evolve during review. Use links
to appropriate sections of [`/docs/project/goals.md`](/docs/project/goals.md),
and/or to documents in [`/docs/project/principles`](/docs/project/principles).
For example:

- [Community and culture](/docs/project/goals.md#community-and-culture)
- [Language tools and ecosystem](/docs/project/goals.md#language-tools-and-ecosystem)
- [Performance-critical software](/docs/project/goals.md#performance-critical-software)
- [Software and language evolution](/docs/project/goals.md#software-and-language-evolution)
- [Code that is easy to read, understand, and write](/docs/project/goals.md#code-that-is-easy-to-read-understand-and-write)
- [Practical safety and testing mechanisms](/docs/project/goals.md#practical-safety-and-testing-mechanisms)
- [Fast and scalable development](/docs/project/goals.md#fast-and-scalable-development)
- [Modern OS platforms, hardware architectures, and environments](/docs/project/goals.md#modern-os-platforms-hardware-architectures-and-environments)
- [Interoperability with and migration from existing C++ code](/docs/project/goals.md#interoperability-with-and-migration-from-existing-c-code)

## Alternatives considered

TODO: What alternative solutions have you considered?
