Here's a class I wrote to make importing and graphing my data a bit easier. It's inchoate right now, and highly proprietary, but it gives some insight not only into how I have my data and analysis procedure structured, but also into how to deal with some of the rudimentary problems that come up when working with a real-world data file. "Real world" should probably be in quotation marks back there, because these data are already highly sanitized, and I knew their structure going in (see https://datafromlanguage.wordpress.com/2015/12/14/cleaning-post-believing), but there are still additional cleanup steps in the code below. Without further ado:

In [22]:
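import os

import matplotlib.patches as mpatches
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

class importation():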
	""" The instance gives you access to .exp, which is
	a dictionary of the experiments, keyed on number (int).
    More importantly, it also gives .expcomp, which is a compilation
    (i.e. a list) of the experiment pairs - so 1 + 2 concatenated,
    3 + 4 concatenated, etc. Currently, you also get scatter_panel,
    the function I wrote to automate pyplot scatter plots for
	this dataset.
    
    Takes no arguments at the moment."""

	def __init__(self):
		self.exp = {}
        
        # We'll change each of these to numeric columns in a moment
        
		self.obcolumns = ['onset_dur',
                          'log_onset_dur',
                          'the_dur',
                          'log_the_dur',
                          'object_dur',
                          'log_object_dur',
                          'action_dur',
                          'log_action_dur'
                         ]
        
        # This maps each experimental condition to a matplotlib color code.
        # It is used below to add a condition_color column to each dataset,
        # so the points in the scatter plots can be colored by condition
        # in a format that plt.scatter accepts
        
		self.colordict = {'control' : 'b',
                          'identical' : 'r',
                          'related' : 'g',
                          'unrelated' : 'y'
                         }
        
        
        # This cycles through and prepares each experiment for inclusion
        # in the compilation, storing the finished experiment in self.exp
        # I toggled the output off, but it can be turned on to error-check.
        
		for x in range(1,9):
			# Raw string, so the backslashes in the Windows path are not read as escapes
			os.chdir(r"G:\Praat_data\Exp%d" % x)
			#print(os.getcwd())
			file = open("dissexp%dr.txt" % x)
			pants = pd.read_table(file)
			try:
				pants['subject'] = pants['subject ']
				del(pants['subject '])
			except KeyError:
				print("This one was fine, boss")
			pants[self.obcolumns] = pants[self.obcolumns].apply(pd.to_numeric,errors = 'coerce')
			if x in [1,3,5,7]:
				pants.soa = -pants.soa
				#print("SOAs reversed for prime before target experiment %d" % x)
			# This moves the x coordinate for SOA plotting a little bit, to make
            # the clusters easier to visualize
			pants['soa_jitter'] = pants.apply(lambda y : y.soa + np.random.randint(1,150),axis = 1)
			#print("Jitter applied")
			pants['condition_color'] = pants.apply(lambda z : self.colordict[z.condition],axis = 1)
			#print("Conditions RGB'd")
			#pants.columns
			#pants.dtypes
			self.exp[x] = pants
			#print(self.exp.keys())
			file.close()
		self.exp1 = pd.concat([self.exp[1],self.exp[2]])
		self.exp2 = pd.concat([self.exp[3],self.exp[4]])
		self.exp3 = pd.concat([self.exp[5],self.exp[6]])
		self.exp4 = pd.concat([self.exp[7],self.exp[8]])
		self.expcomp = [self.exp1, self.exp2, self.exp3, self.exp4]
		#print("Returning a list of concatenated experiments")
		
        # Finally, a collection of pyplot-friendly objects to feed
        # into the legend later
		red_patch = mpatches.Patch(color = 'red', label = "Identical", alpha = .5)
		yellow_patch = mpatches.Patch(color = 'yellow', label = "Unrelated", alpha = .5)
		blue_patch = mpatches.Patch(color = 'blue', label = 'Control', alpha = .5)
		green_patch = mpatches.Patch(color = 'green', label = 'Related', alpha = .5)
		self.patches = [red_patch, blue_patch, yellow_patch, green_patch]
	
	def scatter_panel(self, x, y, color = None, title = "Default Title", ylab = None, xlab = None):
		"""This produces a 2x2 panel-style graph of the various experiments
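		in the dissertation, one panel per experiment pair.
		
		Args:
			x (str): name of the column to plot on the x axis
			y (str): name of the column to plot on the y axis
			color (str): name of the column holding each point's color,
				e.g. 'condition_color'
			title (Optional[str]): title for the whole figure
			xlab (Optional[str]): label for the x axes of the bottom row
			ylab (Optional[str]): label for the y axes of the left column"""
		# Takes two column names as strings and graphs all four experiment pairs' data for them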
		self.f, self.ax = plt.subplots(2,2, sharex = True, sharey = True, figsize = (12,8))
		self.f.suptitle(title, fontsize = 20)
		self.ax[0,0].scatter(self.expcomp[0][x], self.expcomp[0][y], c = self.expcomp[0][color], alpha = .5)
		self.ax[0,0].set_title("Experiments 1 & 2", fontsize = 16)
		self.ax[0,0].set_ylabel(ylab, fontsize = 16)
		self.ax[0,0].tick_params(labelsize = 12)
		self.ax[0,1].scatter(self.expcomp[1][x], self.expcomp[1][y], c = self.expcomp[1][color], alpha = .5)
		self.ax[0,1].set_title("Experiments 3 & 4", fontsize = 16)
		self.ax[1,0].scatter(self.expcomp[2][x], self.expcomp[2][y], c = self.expcomp[2][color], alpha = .5)
		self.ax[1,0].set_ylabel(ylab, fontsize = 16)
		self.ax[1,0].set_xlabel(xlab, fontsize = 16)
		self.ax[1,0].tick_params(labelsize = 12)
		self.ax[1,0].set_title("Experiments 5 & 6", fontsize = 16)
		self.ax[1,1].scatter(self.expcomp[3][x], self.expcomp[3][y], c = self.expcomp[3][color], alpha = .5)
		self.ax[1,1].set_xlabel(xlab, fontsize = 16)
		self.ax[1,1].set_title("Experiments 7 & 8", fontsize = 16)
		self.ax[1,1].tick_params(labelsize = 12)
		plt.legend(bbox_to_anchor=(1, 1),
           bbox_transform=plt.gcf().transFigure,
		   handles = self.patches)
		plt.show()
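
For anyone who wants to try the cleanup steps without my files, here is a minimal sketch of what the loop in __init__ does to each experiment, run on a tiny made-up table. The RAW text, the '--undefined--' cell, the prepare function and its parameters are all placeholders invented for illustration; none of them are part of the class. The sketch also swaps the row-wise apply calls for vectorized equivalents (Series.map and a single array of random offsets), which should give the same kind of result.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-in for one tab-separated data file: note the trailing
# space in the 'subject ' header and the non-numeric cell in onset_dur.
RAW = (
    "subject \tcondition\tsoa\tonset_dur\n"
    "s01\tcontrol\t150\t0.512\n"
    "s01\tidentical\t300\t--undefined--\n"
    "s02\trelated\t150\t0.498\n"
)

COLORS = {"control": "b", "identical": "r", "related": "g", "unrelated": "y"}


def prepare(df, numeric_cols, flip_soa=False, seed=0):
    """Return a cleaned copy of one experiment's table (illustrative only)."""
    # Strip stray whitespace from every header: 'subject ' -> 'subject'.
    df = df.rename(columns=lambda name: name.strip())
    # Anything that cannot be parsed as a number becomes NaN.
    df[numeric_cols] = df[numeric_cols].apply(pd.to_numeric, errors="coerce")
    if flip_soa:
        df["soa"] = -df["soa"]
    rng = np.random.default_rng(seed)
    # One random offset per row, drawn from 1..149 (upper bound excluded).
    df["soa_jitter"] = df["soa"] + rng.integers(1, 150, size=len(df))
    # Look up each row's condition in the color dictionary.
    df["condition_color"] = df["condition"].map(COLORS)
    return df


table = prepare(pd.read_csv(io.StringIO(RAW), sep="\t"), ["onset_dur"], flip_soa=True)
print(table)
```

Stripping every header at once means there is no need to know in advance which file has the stray space, which is what the try/except in the class is working around.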

I'll be using this class in most of my posts, or at least in all of the ones about my dissertation data. I'll also be adding to it as I devise more functionality. You can find the notebook itself posted on the Datasets part of the site.
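
The four panels in scatter_panel differ only in which entry of expcomp they draw and in their titles, so the same figure can be driven from a loop. What follows is a rough sketch of that idea on synthetic data, not the method from the notebook: fake_pair and scatter_grid are hypothetical names, and the random frames only imitate the columns the class produces.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def fake_pair(n, seed):
    """Synthetic stand-in for one concatenated experiment pair."""
    rng = np.random.default_rng(seed)
    return pd.DataFrame({
        "soa_jitter": rng.integers(-300, 300, size=n),
        "log_onset_dur": rng.normal(size=n),
        "condition_color": rng.choice(list("brgy"), size=n),
    })


def scatter_grid(frames, x, y, color, titles):
    """Draw one scatter per frame on a shared 2x2 grid; return (fig, axes)."""
    fig, axes = plt.subplots(2, 2, sharex=True, sharey=True, figsize=(12, 8))
    # axes.flat walks the grid row by row, matching the order of `frames`.
    for ax, frame, title in zip(axes.flat, frames, titles):
        # Each panel takes x, y and color from the same frame.
        ax.scatter(frame[x], frame[y], c=frame[color].tolist(), alpha=0.5)
        ax.set_title(title, fontsize=16)
    return fig, axes


frames = [fake_pair(n, seed=n) for n in (20, 30, 40, 50)]
titles = ["Experiments 1 & 2", "Experiments 3 & 4",
          "Experiments 5 & 6", "Experiments 7 & 8"]
fig, axes = scatter_grid(frames, "soa_jitter", "log_onset_dur",
                         "condition_color", titles)
```

Because every panel reads its points and its colors from the same frame inside the loop, the frames can have different lengths without the colors falling out of step with the points.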