
std::string_view: A Nice New Friend

amirroth edited this page Sep 20, 2021 · 1 revision

C++17 includes a fun new standard library template, std::string_view. std::string_view is essentially the C++ equivalent of C's const char *, a pointer to a constant array of characters. Because this is C++, however, std::string_view also stores the length of the string and supports bounds-checked access (via at()), iterators, and many of the read-only helper functions that std::string provides.

Unlike std::string, std::string_view does not "own" the memory that holds the string itself. Instead, it points either to memory that belongs to another std::string object or to any other memory that is known to contain a string. The latter is very useful for literal strings. What is the difference between these two variables?

std::string programName1 = "EnergyPlus"; // Owns its own copy of the "EnergyPlus" string from the .cstring section of the binary (typically on the heap; short strings may live in a buffer inside the object).
std::string_view programName2 = "EnergyPlus"; // Points directly to the "EnergyPlus" string in the .cstring section, no copy.

To understand the difference, we first have to understand where the literal string "EnergyPlus" actually lives in the program. It lives in a read-only section of the binary, called .cstring on macOS (on Linux the equivalent is .rodata). If you disassemble EnergyPlus and dump out the .cstring section, you will see all literal strings in there. When EnergyPlus loads, the .cstring section is loaded into memory alongside .text (i.e., the code) and several other sections corresponding to global data. The programName1 object internally points to storage that it owns, which contains a copy of the string "EnergyPlus". That storage is generally a heap-allocated array, although many standard library implementations keep short strings in a small buffer inside the object itself; either way the characters are copied when the object is constructed at runtime. The programName2 object simply points to the "EnergyPlus" string in the .cstring section. There is no copy and no allocation/initialization overhead. And because there is no heap allocation or construction, a std::string_view object can be made constexpr, i.e., a compile-time constant, which further reduces overhead. (Read about constexpr here).
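The following is a minimal sketch of the two points above, assuming a C++17 compiler. The names programName and viewSharesStorage are made up for illustration and are not EnergyPlus code. The static_assert lines show a std::string_view being used as a compile-time constant, and the function shows that a view built from a std::string aliases that string's buffer rather than copying it.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// A compile-time constant view of a literal: no allocation, no runtime construction.
constexpr std::string_view programName = "EnergyPlus";
static_assert(programName.size() == 10);
static_assert(programName.substr(0, 6) == "Energy");

// Hypothetical helper: returns true if a view made from a std::string
// points at that string's own buffer instead of a copy.
bool viewSharesStorage()
{
    std::string owner = "EnergyPlus"; // owns its own copy of the characters
    std::string_view view = owner;    // just a pointer and a length into owner
    assert(view.size() == owner.size());
    return view.data() == owner.data();
}
```

Because the view only borrows the characters, it must not outlive the std::string (or literal) it points to.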

When should you use std::string_view vs. std::string? If you need to manipulate the string at runtime, e.g., you need to set its value or modify it in some other way, then you need storage that you own, which means you need a std::string. If you do not need to modify the contents of the string, i.e., you are using a std::string const, you should use a std::string_view instead. For example:

constexpr std::array<std::string_view, 7> dayNames = {"Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"};

Also, if you have a function with a std::string const & parameter, you should replace that parameter with a std::string_view. This will allow you to pass literal strings to the function without first constructing a temporary std::string copy of them.

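int
getDayFromName(std::string_view dayName)
{
   // Return the zero-based index of dayName in dayNames, or -1 if there is no match.
   // The index is a std::size_t to avoid a signed/unsigned comparison with size().
   for (std::size_t i = 0; i < dayNames.size(); ++i)
      if (dayName == dayNames[i])
         return static_cast<int>(i);
   return -1;
}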

Notice that the std::string_view dayName argument is passed to the function by value. Shouldn't it be passed by const & just like a std::string argument would be? Well, there is some debate about this, but the StackOverflow consensus is that std::string_view is an exception to the rule because i) it has only two elements (a pointer and a length) and it is generally slightly cheaper to pass two elements by value than to pass a reference to the structure that contains them; ii) a std::string_view is itself essentially a reference, and passing a reference by reference is considered uncool; and iii) you should pass parameters by value when you can.
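To make the calling side concrete, here is a self-contained sketch that repeats the dayNames table and getDayFromName from above and calls the function with three kinds of argument. The lookupExamples function and its variable names are assumptions added for illustration only.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

constexpr std::array<std::string_view, 7> dayNames = {"Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"};

// Same lookup as in the text: zero-based index of dayName, or -1 if not found.
int getDayFromName(std::string_view dayName)
{
    for (std::size_t i = 0; i < dayNames.size(); ++i)
        if (dayName == dayNames[i])
            return static_cast<int>(i);
    return -1;
}

// Hypothetical caller showing that one by-value parameter accepts all of these.
void lookupExamples()
{
    std::string fromInput = "Fri";           // e.g., a name read at runtime
    std::string_view line = "Tue 09:00";     // a longer piece of text

    assert(getDayFromName("Mon") == 1);             // literal: no std::string is built
    assert(getDayFromName(fromInput) == 5);         // std::string converts to a view
    assert(getDayFromName(line.substr(0, 3)) == 2); // sub-view of line, still no copy
    assert(getDayFromName("Monday") == -1);         // must match the whole name
}
```

In each call only a pointer and a length are passed to the function; none of the characters are copied.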

One issue with std::string_view is that because it does not have internal storage, it does not support the + (i.e., concatenate) operator, so things like this are not allowed.

cout << programName2 + " : an open-source BEM engine from DOE and the national labs."; // This will not compile.  

There are two ways around this. One is to construct a std::string on the fly.

cout << std::string{programName2} + " : an open-source BEM engine from DOE and the national labs.";   

But a better and more general one is to use the fmt::format function from the {fmt} library. This is especially true if you have to concatenate multiple strings together. fmt::format returns a single string object, whereas a chain of + operators creates a temporary string at each step, each of which may also have to be reallocated as it grows. [Ed: the whole overloading of + to mean string concatenate and << to mean write to stream is too cute by half and should not be used.]

cout << fmt::format("{} : an open-source BEM engine from DOE and the national labs.", programName2);   
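If the {fmt} library is not available in a given piece of code, the same "build one string, not several" idea can be sketched with the standard library alone. This is a different technique from fmt::format, not a replacement for it, and the concat helper below is a made-up name, not part of any library: it adds up the lengths of the pieces, reserves once, and appends each piece.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <string>
#include <string_view>

// Hypothetical helper: joins the pieces into one std::string,
// reserving the full length up front so the result is not regrown piece by piece.
std::string concat(std::initializer_list<std::string_view> pieces)
{
    std::size_t total = 0;
    for (std::string_view piece : pieces)
        total += piece.size();

    std::string result;
    result.reserve(total);
    for (std::string_view piece : pieces)
        result.append(piece); // std::string::append accepts a string_view in C++17
    assert(result.size() == total);
    return result;
}
```

A call would look like concat({programName2, " : an open-source BEM engine from DOE and the national labs."}), with programName2 being the std::string_view defined earlier.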

One transient issue with std::string_view is that some libraries still use std::string const & parameters and have not yet been updated to use std::string_view parameters. This will force you to use the std::string{} construction when passing arguments to functions in those libraries. That is annoying, but hopefully only temporary and will go away as libraries get updated. In the meantime, use std::string_view as much as possible!
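As a small sketch of that workaround, assume a library function that still takes std::string const &. Here legacyLength is a stand-in invented for this example, not a real library call. The caller keeps a std::string_view parameter and converts only at the boundary.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical stand-in for a library function that has not been updated yet.
std::size_t legacyLength(std::string const &s)
{
    return s.size();
}

// Our own code keeps the std::string_view interface and builds a
// temporary std::string only where the older API requires one.
std::size_t callLegacy(std::string_view text)
{
    std::size_t n = legacyLength(std::string{text}); // explicit copy at the boundary
    assert(n == text.size());
    return n;
}
```

The conversion has to be written out because std::string's constructor from std::string_view is explicit, which makes the copy visible at the call site.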
