parsedate.py is a Python module to process various human input dates or date ranges into a uniformed format.
Developed for a code4lib mailing list request.
Take a look at the original data in testData.csv. After processing, it becomes Processed_testData.csv.
You can also change the date delimiter and date range delimiter in the output to anything you want when you use the module.
input | output |
---|---|
1947 | 1947 |
August 1947 | 1947-08 |
August 3, 1947 | 1947-08-03 |
August 3-7, 1947 | 1947-08-03/1947-08-07 |
July 24, 1914 - January 30, 1915 | 1914-07-24/1915-01-30 |
May 23, 1957-June 20, 1957 | 1957-05-23/1957-06-20 |
1947 (August) | 1947-08 |
1947 (August 3) | 1947-08-03 |
1947 (August 3-7) | 1947-08-03/1947-08-07 |
May 14 (?) | ERROR |
1917? | 1917 |
May 14, ____ | ERROR |
ca. 1947 | 1947 |
ca. 1971-1972 | 1971/1972 |
ca. 1980s | 1980/1989 |
circa 1947 | 1947 |
circa 1939-1940 | 1939/1940 |
1944 (April - May) | 1944-04/1944-05 |
1939 (November) - 1940 (August) | 1939-11/1940-08 |
1955 (Jan.-June) | 1955-01/1955-06 |
1939 (November 6) - 1940 (August 7) | 1939-11-16/1940-08-07 |
June-December 1983 | 1983-06/1983-12 |
August 24 1988; October 31, 1988 | 1988-08-24/1988-10-31 |
Winter 1985-1986 | 1985/1986 |
1986- | 1986 |
through 1983 | ERROR |
thru 198 | ERROR |
1933, 1937-1938, 1941 | 1933/1941 |
1897, 1906 | 1897/1906 |
pre-1975 | ERROR |
pre-1975 (May) | ERROR |
1965-1975, n.d. | 1965/1975 |
undated | |
n.d. | |
1932, 1940s-1975, n.d. | 1932/1975 |
1960s | 1960/1969 |
1930s-1950s | 1930/1959 |
1954 and undated | 1954 |
5/9/1970 | 1970-05-09 |
Saturday, 9 May 1970 | 1970-05-09 |
20 Jan 1973 | 1973-01-20 |
1944-1950 [died Aug. 1949] | ERROR |
1967-onward | 1967 |
January 27, 1975 [1974?] | ERROR |
re: 1906 | 1906 |
Easter 1961 | 1961 |
May 31, 1964-Fall 1965 | 1964-05-31/1965 |
June 2 - ____, 1971 | ERROR |
n.d.; May 26, 1976 | 1976-05-26 |
May 1973 - Jul7 1973 | 1973-05/1973-07-07 |
May 1973-July 1973 | 1973-05/1973-07 |
Download or clone this github repository.
Save your orginal data file into the same folder as test.py and parsedate.py. The data file does not need to be a .csv, it could be a plain text file as long as each entry starts at a new line.
In test.py, replace the input file name with the name of your data file, then run:
python test.py
You can also change the date delimiter and date range delimiter to anything you want when using the parse() method.