Word Frequency Count

This small C program counts the frequency of all words in a text file. Usage (you might have to adjust PYTHON_PREFIX in build.sh before running this):

$ make
$ ./wfc <somefile>

Using Python as a C library

Ilan Schnell - May 2019

In this article, we want to show something unusual: how to write a simple C program that uses only the Python C library.

The Python/C API is very well documented and offers access to all of Python's data types and their functionality, such as list, dict, etc. Python provides this C API to allow users to write Python C extension modules, which behave like Python modules but are written in C rather than in Python.

These C extensions are very important when interfacing Python with existing C libraries and when writing speed- or memory-critical libraries. The Python standard library itself contains a large number of C extension modules, which are an excellent resource for exploring how to write C extensions. When I wrote bitarray in 2008, I found studying the standard library's array module extremely useful.

Now we want to write a pure C program that uses some of the functionality exposed in the Python/C API, such as Python's data structures and algorithms, but is not a Python C extension. Instead, this C program has its own main() function and only uses the Python dictionary implementation to count word frequencies in a text file. The task of reading a text file and recognizing words separated by whitespace is not too hard to accomplish in C. Whenever a word is read, it is added to a dictionary, which maps each word to its frequency count. The function that does this looks like this:

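void add_word(PyObject *dict, char *str)
{
    PyObject *value;
    long cnt;

    /* borrowed reference, or NULL if the word has not been seen yet */
    value = PyDict_GetItemString(dict, str);
    cnt = (value == NULL) ? 0 : PyLong_AsLong(value);
    cnt++;

    /* PyDict_SetItemString() does not steal the reference to the new
       count, so it has to be released here to avoid a leak */
    value = PyLong_FromLong(cnt);
    if (value == NULL)
        return;
    PyDict_SetItemString(dict, str, value);
    Py_DECREF(value);
}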
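
The word-reading step itself is not shown in this article. As a rough sketch of one way it could be done (the helper name read_word and its buffer-based interface are assumptions made for illustration, not taken from main.c), words can be pulled from a stream by skipping whitespace and then collecting characters until the next whitespace:

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper (not from main.c): read the next whitespace-separated
   word from fp into buf, which holds size bytes.  Returns the length of the
   stored word, or 0 once the end of the stream is reached.  Words longer
   than size - 1 characters are truncated and the remainder is skipped. */
size_t read_word(FILE *fp, char *buf, size_t size)
{
    size_t len = 0;
    int c;

    assert(size > 1);

    /* skip leading whitespace */
    while ((c = fgetc(fp)) != EOF && isspace(c))
        ;

    /* collect characters up to the next whitespace or end of file */
    while (c != EOF && !isspace(c)) {
        if (len < size - 1)
            buf[len++] = (char) c;
        c = fgetc(fp);
    }
    buf[len] = '\0';
    return len;
}
```

Each word obtained this way could then be handed to add_word() until read_word() returns 0.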

For add_word() to compile and run, we need to include Python.h and link against libpython. The entire program, and a build script that works on Linux and macOS, can be found here.

Obviously, we could have easily written a program for this particular task in pure Python without having to worry about any C code at all. But this is not the point of this exercise. The point is to show how the Python C library can be used directly within a C program (without writing an entire C extension).

I hope you've enjoyed this article, and maybe learned something new and useful.
