Oakland and the Search for the Open City
At the center of the Bay Area lies a city struggling with the woes of many old, great cities in the USA, particularly those in the Rust Belt: disinvestment, white flight, struggling schools, high crime, massive foreclosures, political and government corruption, and scandals. Despite these harsh realities, Oakland was named among the five best places in the world to visit in 2012 by the New York Times, something we were simultaneously excited about and stunned by. Oaklanders are proud of our heritage, our diversity, our legacy of great musicians, great food, and amazing art, and our truly beautiful city by the bay.
We’re not huge like Chicago, New York, or San Francisco—megacities with large populations and a corresponding large city staff. We don’t have a history of prominent leaders in the open government movement. Still, we’re on the bumpy, exciting road that open data lays out. This road has many possible paths for our city—some lead to truly open government, and some lead to only minor improvements and “openwashing” (referring to the practice of publishing a few datasets and suggesting the government has therefore achieved openness and transparency).
Our journey shows why open data matters to a city with big troubles and how something as geeky as public records access supports a positive transformation in our city—for everyone, not just for us geeks.
The Start of Open: The Conviction Phase
The event that changed my thinking and changed my organization toward “open by default” was as unexpected as it was transformative. I’ve had the privilege to work at the Urban Strategies Council since immigrating to the USA in 2006. The Council is a social justice nonprofit that strives to support equity in urban communities and in policies that impact low-income communities, mostly communities of color.
A winding road led me to this exceptional organization. I started out as a land surveyor and planner in the private sector, dabbled in IT consulting in London, then landed in public health, working in spatial epidemiology. In the private sector, I got to interface with government in order to access and submit data ranging from suburban plans to large engineering project data and satellite imagery. Following that, I spent some years in the Western Australian Health Department, where I helped to establish a statewide spatial database to support our workforce and to enable public interfaces into government data. Here, I got to experience the empire building and data possessiveness I’ve railed against in the years since. In this job, I gained firsthand knowledge of what it’s like to create, manage, and be responsible for government data. There I experienced both the desire to control and restrict access to “my data” and the knowledge that this restricted data can do so much more when others have access to it. That job demonstrated a great conflict between securing and managing confidential data and supporting easy access to it.
As I was leaving this role, I was struck by the realization that even after years of dealing with data, people in our department still didn’t know our team existed and was available to serve them. In later years, I’ve realized that this is symptomatic of most government agencies: we do a terrible job of communicating. Our governments are not just struggling to be open and accessible to the public; they also fail to do this well internally.
At the Urban Strategies Council, we have a long history of supporting data-driven decisions in local government, community engagement, and the nonprofit community. In order to do this, we’ve maintained our own data warehouse to allow us to perform very action-oriented social research and spatial analysis. We negotiate access to government data (often paying dearly for the privilege), we sign non-disclosure agreements, we scrape the data, and sometimes, we’re lucky enough to easily find what we’re looking for online. We even have a formal goal to support the democratization of data. Like most of our partners in the National Neighborhood Indicators Partnership (NNIP), we’ve done this through building web mapping platforms that enable policy makers, organizers, and the general public to access complex data in simplified systems in order to support broad use of this data. Like most other organizations, our presumption was that because people always call us asking for custom maps, we needed to give them the tools to make them too. This is a fair response, if slightly disconnected from those others’ reality. Oftentimes, people will ask us for raw data. Sometimes, we have permission to distribute data, and sometimes, we can justify the staff time to publish or send the data to those asking, but often, we cannot deliver due to a combination of those two factors.
A rather ordinary meeting of the Bay Area Automated Mapping Association (BAAMA) triggered my change of heart about open data. A Canadian firebrand named Paul Ramsey, who built a great open source spatial database tool we use, called PostGIS, finished a presentation with a slide that boldly declared: “Your use of data is not necessarily the highest use of those data.”
This one simple statement gave me the conviction to enable others to do good, to understand issues, and to easily find data and leverage it. It struck me that every time we don’t make data openly available, we are preventing some other great improvement from happening. Every time we burn through project funds trying to track down and beg, borrow, or scrape data, we are in fact perpetuating the very thing that we regularly complain about from our government. It was suddenly clear that when we set out to rebuild our mapping and data visualization platform (see http://viewer.infoalamedacounty.org), we had to plan to open our data at the same time. When our new system launched in 2012, we were the first, and I think still the only, system built on an ESRI base that allows users to easily download both our geographical data and the raw data behind the maps. We paired this interface with a cobbled-together data portal to help users find our cleaned, value-added raw data too. My reasoning was that if we’d used funders’ dollars or government contract dollars to acquire, clean, and geocode the data, then we should really be making more use of it than we could by keeping it locked away.
Many nonprofits and university think tanks like ours face the same issue: we’ve collated incredible amounts of public and private data, yet we really don’t have the funds and staff to take full advantage of it all. I grew increasingly frustrated with this reality; we spent days getting data and doing a single project with it, perhaps reusing it a few times, but the true potential of the data was clearly not being realized. In opening our data, we have seen a change in perception of who we are and a marked increase in visibility. We still struggle to avoid being the gatekeeper, the one with control. Instead, we try to be an enabler, a node on the local network that connects those in need with the people or data they require. This is a rewarding role, but even more rewarding is the shift from being analysts who devote significant time to finding data to analysts who get to think, do more real analysis, and have more impact as we benefit from open data in our region.
I believe that to scale open data broadly across local government, we must rely on government staff and leaders to have a moment of conviction similar to the one I had. We’re not serving our community well by restricting access to data. Even where we have policies that mandate certain records be made public when requested, those policies are often ineffective if the person who manages the data doesn’t like you or your use of it. Government is limited and mandated through policy, but at the end of every request or every new idea, there is a government official with his or her own ideas, struggles, and problems.
In our push to realize open data across all our cities, we must never forget this fact. We are seeking to reach and impact people, not institutions, with our ideas. Yet, as Aaron Swartz (2012) said, we must fix the machine, not the people. We have to reach and connect with the people in order to fix our broken, closed governments.
Oakland’s city government had long been seen as a blocker of access to information. Information is routinely not accessible unless you are known and liked by the person on the other end. As we launched our own data system, I realized that Oakland was not going to just open its data on a whim. It needed a big push: an open data policy.
I had never written an adopted policy in local government, so it was rather intimidating to begin such a task. Thanks to the work in New York and San Francisco, there were great policy and directive examples I could pull from the Code for America Commons site, which I reworked to suit Oakland (and also Alameda County). I then summarized the priority datasets from most other “open data” cities and drafted some guidelines for how a city could consider adopting and implementing an open data approach. I was now armed with reusable policies, good examples of how powerful opening data had been elsewhere, some clear steps and directions, and what I thought was a silver bullet use case. Where to now? I had never lobbied before, but it felt like that was the next step.
I met with many Oakland city councilors and, sometimes, their chiefs of staff, to discuss this new thing that I was convinced mattered in our city. In these one-on-one discussions, I attempted to lay out the key issues, the benefits, and the need for Oakland to do this. I also discussed the likely (modest) costs to implement this policy. Two reactions stood out in these conversations. First, I learned quickly that my silver bullet was not viewed as very shiny, and second, I heard that our city councilors had a variety of problems that open data could help solve.
In every case, discussing open data led, if partly out of terminology confusion, to a discussion about the technological struggles that the city faces. These included poor access to internal data, the benefits of open source technology, and the city’s need for better ways to interact with the public. City councilors were frustrated with a lack of easy access to quality data on city assets and operations, which made their job of developing informed, data-driven decisions much harder. These internal gains are not insignificant, and any advocates wanting to push for open data would do well to identify local examples that would meet the needs of government itself. After all, behavioral change is easiest when we can relate to a personal benefit. I had a similar experience when pushing Alameda County to consider open data. Developing a complex internal data-sharing infrastructure is incredibly expensive, slow, and frustrating, but opening data for the public is a quick win politically. It also provides fast access to new data for government agencies themselves, which is something that was not possible previously.
My local example of how open data can enable so much more than government can provide, afford, and imagine, was Crimespotting, one of the earliest and most impressive examples of civic hacking on almost open data. It was what I hoped would be our “silver bullet.” Built by a good friend, Michal Migurski, this site took a nightly Excel file from an FTP server and provided an elegant, usable interface that helped residents understand the recent crime activity in their community. At almost zero cost to the city, this was my story to demonstrate how awesome opening all the city’s data could be. I was wrong. While it resonated clearly with some city councilors, others actually hated the site, making my job much harder.
The reasons for not liking my “silver bullet” mostly centered on the fact that the site did not give them everything they wanted and that it provided information about unpleasant events that made our city look bad. The second concern is a tough argument to work with, but the first is an opportunity. It became clear that Crimespotting itself is not a bad use of data; it’s just that city councilors didn’t have good access to clear reporting, summary statistics, trend data, and custom reports for their own districts and police beats. This is a reflection on the lack of data analysts in the city and the limited capacity of certain city departments. It also highlights a trend of outsourcing “problems” to vendors. Vendors can create a system to do crime reporting and analysis, but they are not experts on the issues, so it’s hard for them to thoughtfully analyze and communicate the data in a local context.
This presents an opportunity for open data. If the city consistently opens its crime data, others can build the interfaces, tools, and reports that are needed for good policy-making. We’ve helped provide basic Excel files to the city for short-term help, but this need provided a clear use case for OpenOakland redeploying CrimeinChicago.org for use in Oakland: it meets a local need and leverages open data to show the potential in a way that resonates with local leaders.
After my first experiences doing something that resembled lobbying, one city councilor, Libby Schaaf, became the internal champion to make this happen. Unlike other cities, we did not get strong executive support to immediately implement this law. Instead, we had a resolution approved to “investigate open data platforms,” resulting in an approved plan and a contract with Socrata to provide such a platform.
Are We Opening Government?
This left Oakland in a strange position. We have both a community-driven open data platform and a city-supported platform, but we are one of the only cities to have a web portal and no legislation to support it (I recently learned that New Orleans is in the same position). To make this even stranger, Alameda County has done the exact same thing. They have the portal, but no policy to support or sustain it.
On one level, this is a wasted opportunity. It’s a rare and beautiful thing when both city/county staff and their elected leaders want the same thing. Both parties have a stake in this and have expressed serious support for open data, yet government staff doesn’t think legislation is needed and they are not pushing for it. Our elected officials have yet to follow through with legislation to ratify the use and adoption of open data in both the city and county.
There is an aspect to the open data movement that is not really about transparency. It’s not uncommon to find an elected official who isn’t enthused about the concept of open government: more transparency and, ultimately, more accountability. The transparency argument was not a convincing one for me locally. However, the promise of supporting innovation, making the city more accessible, and promoting new opportunities, along with better internal access to information, was an effective approach. While my pragmatic side is comfortable with a good decision for any particular reason, my idealistic side finds the positions of many officials unsettling and a reflection of the trend being identified by some as “openwashing.”
There is sometimes confusion that the adoption of an open data platform creates open government in and of itself. This is not the case—open data alone is not sufficient to create an open government.
The following message from an Alameda County government Twitter account (@ACData) on April 2, 2013, is an example of this flawed logic in action:
#OpenData + #Hackathon = #OpenGov #ACApps Challenge 2013.1 on 4/27 at #Berkeley High School. http://code.acgov.org #gotcode?
The line of reasoning is that we gave you some of our data (awesome), we want you to do stuff with it (nice, thank you), and hence, we now have Open Government (not quite).
Some role clarification is important here. The staff who are trying to open data and engage citizens are in fact moving toward a reality that embodies true open government. However, there are still bad apples within our local governments who are investigated for fraud, mismanagement, or corruption, or for hiding things from the public. Open data that includes a lot of noncontroversial data is low-hanging fruit and is important, but this is only one small piece of the puzzle that leads to open, accountable government. It’s a great starting point that takes minimal investment and leads to good publicity, but if we allow our local governments to paint the picture of this work meaning “we’re now open so leave us alone,” then we have failed them as much as they have failed to truly understand why open data matters.
There is a lesson here for many other cities and for Oakland. Publishing data is not the end game. It is a big deal though. Oakland is taking an easy road and requires increased advocacy to adopt a strong policy to sustain open data. By keeping elected officials more engaged through this process, we might have avoided this situation where we have a practice, but no policy—the opposite of almost every other city working with open data. The risk for us is that as soon as a senior city official doesn’t like something being open, it goes away. Take the city staff salary data, which was originally published but then removed. The words “Coming soon” were then published on the city’s earlier data page. This is a patently false statement because the reality is that the data was removed. The same data was still, however, available on the state controller’s website.
The Panacea of Data-Driven Cities
It’s hard to imagine a new policy, new social service, or new investment decision being made in any company or government without the strategic use of data to inform the thinking and planning. Still, too frequently, cities do not have staff with the skills or the mandate to thoughtfully analyze public and confidential data. Those of us in the private sector would often be horrified to see the type of information provided to city councilors to aid their decision-making. Since ninety percent of the world’s data has been created in the last two years, we have no excuse for not looking at reliable data to inform our planning and policy-making. This is the future we dreamed of, where data on almost any issue is readily available and easily analyzed. Only, we aren’t there yet.
Opening data in Oakland and Alameda County has raised a lot of questions about the quality and reliability of this data, and with due cause. This is a valid fear among bureaucrats, yet it is one that has no rightful place in our governments. If our raw data is bad, our decisions can only be misinformed. Opening data, therefore, is in some respects the beginning of a data quality control effort within our local governments. Sunshine reveals many flaws, and open data reveals many flaws in our data collection, management, and use in city government. These realizations may make some people feel bad for a time, but the staffer who has been lamenting the lack of time and funding to properly manage the data in their department now has allies across their community who are also concerned about this lack of attention toward data management.
This has traditionally only been possible with very small, tight-knit groups of “experts” who work with government. These have generally been groups who would not push back hard on government for fear of losing favor and income streams. By opening our data, we can now take advantage of the larger pool of citizens who care about and know about that data. We can learn from them and improve our processes and practices, which will benefit both the internal users of our public data and the wider public.
The problems that become visible around government data can often have ugly consequences, but they must be seen as growing pains as we move from immature uses of data in government to a place where data-driven decision-making is the norm.
Leveraging the Long Tail of Government
Many critics in Oakland have suggested that open data doesn’t explain anything, doesn’t make anything clear, and doesn’t provide answers. Some also suggest that the community focus on open data and open government is overly focused on technological solutionism. The first group is right, albeit barely, while the second group has not fully comprehended this movement and its intent. Let’s take a look at a current practice in government and then consider what open data means for the future.
In Open Government (2010), David Eaves provides a cogent story that elegantly describes how citizens’ attitudes toward closed government decision-making have changed in the information age:
There was a time when citizens trusted objective professionals and elected officials to make those decisions on our behalf and where the opacity of the system was tolerated because of the professionalism and efficiencies it produced. This is no longer the case; the Internet accelerates the decline of deference because it accelerates the death of objectivity. It’s not that we don’t trust; it’s just that we want to verify. (Eaves, 2010.)
He goes on to compare Wikipedia and Britannica, where the authority that is transparent in its process is, in fact, more trusted. Eaves posits that “transparency... is the new objectivity. We are not going to trust objectivity unless we can see the discussion that led to it.”
In Oakland, open data would have saved the city from an embarrassing failure surrounding a new crime fighting strategy. It could also have spurred a much richer deliberative process to build a comprehensive approach for an issue, instead of a bad model created in closed access meetings. In 2012, the city announced a crime fighting strategy called the 100 Blocks Plan. Immediately, the community, my organization, and dozens of other organizations raised concerns over a serious lack of detail about this plan. We all questioned just exactly where these hundred blocks that contained ninety percent of the crime were. We met with city staff who looked over our initial analysis, which showed a very different reality than what the city had laid out in its plan. They confidently told us that their data was the same, which clearly was not the case. The city chose not to publish accurate information about a place-based strategy and refused to publish the data used to make this critical decision that affects the safety and well-being of our city.
At this point in time, crime reports were almost open data. Michal Migurski had collected years of data for Crimespotting, and the Urban Strategies Council had also cleaned and published even more of this data. When the official response did not ring true with our perception of good government (the model looked quite wrong and the planning process was secretive) in a city with dozens of organizations with analytical and crime prevention experience, we saw this as a failure to leverage the citizens and professionals who can contribute to public decision-making and planning.
In June 2012, we released our own study of Oakland crime hotspots. Our research indicated that at most, one hundred city blocks (and a buffer) could contain only seventeen percent of violent crimes—not the ninety percent figure publicized by the city. We were frustrated that at a time when other cities were publishing raw data to inform the public, along with quality analysis to help us understand their process, Oakland was doing the opposite. So, we attempted to lead by example. We published our study, including the raw data we used for our calculations, and a detailed methodology, so others could review our findings and correct us if we made serious mistakes (Urban Strategies Council, 2012). (We didn’t.) This revelation obviously caused a media frenzy that we had no desire to be involved in, but we did think it was valuable to have an informed discourse in our city about crime and city policies to reduce crime. After defending the official plan and numbers as correct, the city turned around and admitted that the data the plan was based on was, in fact, wrong.
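The core check behind a claim like "one hundred blocks contain ninety percent of the crime" can be sketched in a few lines. What follows is a minimal illustration only, not the methodology of our published study: it ignores the buffer around blocks, assumes each incident has already been assigned to a block, and uses made-up toy data. The function name `top_block_share` and the block IDs are hypothetical.

```python
from collections import Counter


def top_block_share(incident_blocks, n_blocks=100):
    """Return the fraction of incidents that fall in the n busiest blocks.

    incident_blocks: one block ID per recorded incident (hypothetical input;
    real data would first need geocoding and assignment to blocks).
    """
    if not incident_blocks:
        return 0.0
    counts = Counter(incident_blocks)
    # most_common(n) gives the n blocks with the highest incident counts.
    covered = sum(count for _, count in counts.most_common(n_blocks))
    return covered / len(incident_blocks)


# Toy data, invented for illustration: ten incidents across seven blocks.
incidents = ["A", "A", "A", "B", "B", "C", "D", "E", "F", "G"]
share = top_block_share(incidents, n_blocks=2)
print(f"Top 2 blocks hold {share:.0%} of incidents")  # Top 2 blocks hold 50% of incidents
```

Because the input is nothing more than a list of incident locations, anyone with access to the raw records can rerun a calculation like this and compare the result with an official figure.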
The results of these unfortunate events were in no way intended to make any public officials look bad, but to elevate the level of engagement in public decision-making. We wanted to highlight the need for open data to allow the citizens of our city to understand the thinking behind city decisions—to test any statements of fact themselves. It is no longer an acceptable circumstance for local government to make decisions and ask that we simply trust its goodwill and expert opinion.
Oakland’s Mayor Quan told the media that she was at fault and should have vetted the data more. In this suggestion, I believe she was wrong. It is far from the role of a city mayor to conduct an independent review of every single analysis or metric given to them. Any elected official must be able to rely on the quality of analysis from city staff and other experts. What open datasets open up is a future where citizen experts can easily provide qualified perspectives on government decisions, analysis, and statements. This is a democracy that can support the role of citizens in active decision-making.
As I suggested earlier, publishing the raw data itself does not create an informed and expert community; it does not equate to answers being readily available. What it does do is enable far deeper engagement on issues that our communities care about. As Aaron Swartz submitted in Open Government (2010), transparency efforts themselves tend to not work in their current forms. The potential of open data is to enable far more responsive and thorough oversight of political and governmental actions, which ideally, could lead to a future where officials are operating in a space they know is no longer invisible and no longer protected by a layer of public records refusals. Instead, it would create a reality in which hundreds or thousands of their constituents can access and question data that explain their motives and actions.
What Has Data Done for Me Lately?
As the furious rush to build “innovative” and “game changing” civic apps and new tools starts to plateau, I believe we are seeing a slow but steady shift into finding ways that this new treasure trove of open data can actually do something useful. By useful, I mean solve problems, uncover unknown problems, and help illuminate new solutions to old problems. I love geeky apps that make my already comfortable life even better, more connected, and more informed, but this is indeed just a way that new technology and data are empowering the empowered. I’ve seen data do so much more, and we are starting to see this use trend growing nationally and globally.
Groups such as DataKind, GAFFTA, and Geeks Without Borders, and local research/action tanks, like the Urban Strategies Council, have been doing this well—in our case, for decades. Traditionally, it looks like this: define your problem, identify data to inform the problem/solution, obtain data, analyze it, and communicate results and solutions. Open data takes the pain out of this old equation. It also takes the exclusivity out of the obtain data element and provides for unlimited problem solvers. I believe there will be a future need for sophisticated data shops like ours that can gain access to raw, record-level sensitive data for research purposes, but open data sounds the death knell for the gateway or custodian model of data warehousing. The nonprofit and academic sector has to also realize that we have been as guilty of data hoarding as anyone and that we can enable more by following the lead of the public sector and opening our data wherever we can.
On many urban research and spatial analysis projects, data acquisition can run as high as twenty percent of a budget. In just a few short months of working in Oakland with partially open data from the city and the county, we’ve already saved dozens of hours of time on two major projects. These saved costs to a community add up, especially in the case where researchers are working for the government on contract.
Working with already open data is a shift away from the typical model, where we have to charge local government for the time it takes us to source and uncover its own data to use for analysis. In the cases when we have to do our own data gathering, we should be making it open by default; otherwise, we ourselves are contributing to the problem of withholding valuable data that could be public. Nonprofit and academic institutions are often as protective and closed by nature as government has been, with the added obstacle that they lack the public mandate that comes with being a taxpayer-funded entity. There have, however, been promising instances where foundations have begun opening their data to the world (DonorsChoose.org is one good example).
At Urban Strategies Council, we have been a national example in the adoption of an “open by default” policy for all the data we’ve held and all that we receive, but this also is a slow road since most nonprofit organizations severely lack data and the technological capacity for general operation, management, and publication of their data. When this does happen (and it must), we will see two major outcomes that are important in the social sector in particular: much more transparency in a sector that typically has little (Carl Malamud’s inspiring work to publish 990s does not yield measures on quality or efficiency of programs, unfortunately) and the familiar benefit of rich data resources being unlocked and available. Nonprofits, foundations, and universities do the bulk of community surveys in the USA, and many unknowingly duplicate each other’s work because the results are closed and protected. This results in the over-surveying of many communities and in wasteful efforts that would not be needed should raw survey results be published by default, along with the final reports.
In the present scenario, funders receive impact reports from grantees stating they served x people for y service, rarely providing any “where” or any long-term outcomes or impacts, merely demonstrating transactional gains through service delivery. Mandating or encouraging small to large nonprofits to begin opening detailed (but not confidential) data will allow funders to begin evaluating real impact. It will allow those who look at the macro picture to accurately identify gaps in actual service delivery and enable them to evaluate macro level outcomes to help guide future funding priorities. If you currently believe that this is common practice in the philanthropic sector, you couldn’t be more wrong. What started as an effort to get government to open up publicly funded data for a myriad of reasons will inevitably result in this same trend in the community development and social sector. We will require transparency over simple goodwill and flowery slogans, and we will push for evidence-based practice over “doing what we’ve always done because it works.”
Into the Danger Zone
One caveat that those of us in the data trade will have to work carefully around is competing standards of open data. Many organizations once required the use of non-disclosure agreements and memorandums of understanding, but these no longer have any meaning when we can find the data we need online. There is, however, a tricky middle ground appearing. Agencies that once would furnish us with detailed, sensitive data for the purposes of research are now publishing some data openly while also developing better processes for data requests under the Public Records Act (PRA) or FOIA. This results in some blurring of the lines between open data and confidential data and will require very carefully communicated permissions.
Our local police department recently provided us with a rich homicide investigation dataset, something that we have accessed over the years. This time, however, it required a PRA request. Our assumption that all records provided via PRA are public, and that we could therefore republish this data, turned out to be partly wrong.
The department had only given us the data once more because of our trusted relationship as a community research partner. They did not, in fact, consider this sensitive data to be public. In the confusion over open data and new PRA procedures, however, they did issue the data in response to a PRA request, thereby technically releasing the data as public record. This reflects the need to carefully and intentionally review data access procedures in every department housing sensitive information. Opening data provides an excellent opportunity to completely document your data holdings, along with the legal availability and metadata for each. This kind of attention is necessary to avoid confusion in the process and assumed permissions. In this case, it may be necessary to adopt a research access agreement similar to that used by school districts to ensure that PRA releases and sensitive data are not handled incorrectly.
There is one important sector of American cities that has barely been affected by open data: community development. This field consists of local government departments (often small ones) and local nonprofits, often Community Development Corporations (CDCs). Both of these types of organizations are hampered by a lack of access to granular data relating to their communities of interest, commonly property data, market trends, development, foreclosures, and data from other sectors. Presently, much of this data is locked in government silos or sold at a premium to the public and other government agencies, despite being public data at its core. The barriers of cost, unclear availability, and confusion over the quality, currency, and relevance of most property data mean that too many CDC-type operations are running data-blind. We’ve interviewed many local organizations in this field and found that almost every one faces these barriers in a way that affects their effectiveness and impact.
This sector must be data-driven, as the volume of investment it draws nationally is substantial. Decisions of where, when, and how to invest would rarely be made without solid data in the private sector, yet this is all too common in the CDC world. Opening key property and economic development data will add a level of sophistication and rigor to the CDC world that is important. However, it will not automatically create skills or cheap tools to analyze and utilize this data. As funders and government become more focused on evidence-based and data-driven efforts, both sources of investment must accept their role in supporting or providing that kind of capacity.
Who Owns Your Data?
Opening property data will bring a fight over public ownership of that data. Presently, in Alameda County, any nonprofit or any city or county agency that wishes to consider the impact of foreclosures on its work, or to evaluate the opportunities that foreclosures have created, must purchase this data from private sources. Every agency must do so independently. The opening of some data should prompt us to ask about the realities behind other data not being opened. In this case, the source of this data is a county agency: the Clerk Recorder. The Clerk Recorder has a simple mandate that has not changed significantly for some time. When a foreclosure is filed, it comes in paper form. The date, bank or foreclosing agent, and the homeowner are electronically recorded, while other critical details, like amount, address, and city, are left on a scanned image. These images are made available to title companies, who provide the once-valuable service of creating digital records, which are then sold back to any government or public agency that needs them. In 2013, this cannot be accepted as good government.
This is not a flaw with the Clerk Recorder, who seems to be a genuinely helpful person based on our interactions. It’s a flaw in how we think about assets and resources and a lack of agility in government to adapt as opportunities arise. These corporate data fiefdoms should not survive the broader opening of public data, as people’s expectations rise and as government is encouraged to create value where it can. Creating usable data is one of the easiest ways to do so. Can you imagine not being able to answer a simple question like “How many foreclosures did you accept in the city of Oakland this year?” Because the agency itself creates no data, it cannot answer this question directly.
Are You Open for Business?
On the back of the benefits that community development can reap are the even more substantial rewards gained through increased economic development. This is simple in practice, but we’ve apparently been asleep at the wheel in cities like Oakland. Our city desperately wants investment and retail, but we’ve failed to make the path smooth and to help those considering our city make informed decisions. For large corporations, access to local data is, perhaps, less of a barrier to investment because of their access to professional market analysts, brokers, and the like, but for a small to medium enterprise, these services are mostly out of reach.
It should be a no-brainer for Oakland to both open its data and encourage the development of tools on top of this data. As of January 2013, a potential new business owner or investor in our city could not find data or tools online to review business permits, building and development permits, vacant properties, blight, or regional crime comparisons. Compare this with our neighboring city of San Francisco, where all these things are simply available. They’re available because they are needed and help make the path smoother for new business. When times are tight in local government, as they have been during the past several years, we must get smarter. Releasing all this data is opportunistic and critical. If our city is unable to build the tools to help attract business because of funding or outdated IT procurement approaches, then releasing the data alone will suffice, at marginal cost. Others can build tools more cheaply and faster. The old claim that we can’t do this because it’s expensive is hard to use as an excuse anymore. This change will take leadership from our city to identify an area of internal weakness and engage with the broader community in an effort to develop the tools and analyses that this data makes possible. This would be an incredible demonstration of recognizing the potential in the long tail of government and of how much more open government can do collectively.
Every city takes a different path to open its data and progress toward open government. Oakland is off to a slow but exciting start with its data platform and with increased engagement through this data. Yet, it remains to be seen if our city will push through legislation to protect and sustain the worthy efforts of city staff at this point. Our lesson here is that engagement with elected officials must be sustained at a high level to ensure policy matches practice and also that developing strong initial resolutions is the key to avoid watered down plans and slow, uncertain paths forward.
Opening data is increasingly being seen as a single solution that will satisfy the transparency advocates. It is up to those of us who understand how much more is needed to speak truth to this misrepresentation of what open data is and is not. This relies on stronger ties with elected officials and behaviors more akin to community organizing efforts than those of tech startups. More open data provides us all with powerful fuel to demonstrate ways that open government can truly be more effective and more agile, but it will be largely left to those of us on the outside to demonstrate this and to encourage government to embrace open data more broadly.
While the app-developing world is an attractive audience to make use of new open data, there will be incredible gains in efficiency, decision-making, and planning in the community development, social service, and land management sectors that are just as impactful. Software developers are the focus for now, but in time, as this movement reaches the analysts, planners, and researchers who also live on data, this movement will come of age. Soon, more of the latter will experience the joy of responding to a complicated research data request with the phrase “Sure you can have the data. It’s open and online for free already!” We can all become enablers, part of a rich platform that creates value and shares for the benefit of all.
I’ve worked across dozens of cities in the USA and elsewhere, and for decades, the problem was always this: we can’t get the government data we need or it’s expensive. Enough cities have now demonstrated that this should not be the norm anymore. We enable far more value to be created once we become open-by-default cities: open for business, open for engagement, and open for innovation.
After writing, some positive progress has been made in Oakland. At the request of council member Libby Schaaf, we are beginning crowdsourcing of new legislation for an official open data policy in the city of Oakland. We’ve combined what we see as the strongest and most relevant elements of policies from Austin, Texas; Portland, Oregon; Raleigh, North Carolina; and Chicago. We’ve published the draft for public comment, and so far we have great feedback from other practitioners in cities with experience of their own process and locals interested in making this right for us. It’s an experiment. It should be fun. Next we hold a roundtable session to review and consider what this means for our city. And then we try to get this passed! Onward.
About the Author
Steve Spiker (Spike) is the Director of Research & Technology at the Urban Strategies Council, a social change nonprofit supporting innovation and collaboration based in Oakland for almost twenty-six years. He leads the Council’s research, spatial analysis, evaluation, and tech work. He is also the Executive Director and co-founder of OpenOakland, a civic innovation organization supporting open data and open government in the East Bay.
Eaves, D. (2010). Open Government. Available from https://github.com/oreillymedia/open_government
New York Times. (2012, January 6). The 45 Places to Go in 2012. The New York Times. Retrieved from http://travel.nytimes.com/2012/01/08/travel/45-places-to-go-in-2012.html?pagewanted=all&_r=0
Swartz, A. (2012, September 25). Fix the machine, not the person. Retrieved from http://www.aaronsw.com/weblog/nummi
Urban Strategies Council. (2012, June 5). Our Take on Oakland’s 100 Blocks Plan. Retrieved from http://www.infoalamedacounty.org/index.php/research/crimesafety/violenceprevention/oakland100blocks.html
Urban Strategies Council. (2012, May 22). Our Method: Oakland’s 100 Blocks Plan. Retrieved from http://www.infoalamedacounty.org/index.php/research/crimesafety/violenceprevention/oakland100blocksmethod.html
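- Rust belt: the region along the Great Lakes in the northeastern United States, once known for steel and other heavy industry, which began a gradual decline in the 1970s as production centers shifted with globalization. It includes cities such as Chicago, Buffalo, Detroit, and Cleveland.
- White flight: the movement of white residents of European descent out of racially mixed city centers into racially homogeneous suburbs, or from the North to southern regions with a milder climate (such as Florida).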